
Please note that this repository is used for development and review, so quality assessments should be considered work in progress until they are merged into the main branch.

1.5.1. Lake Victoria’s 2020 Flood Event Analysis#

Production date: 29-08-2025

Produced by: Amaya Camila Trigoso Barrientos (VUB)

🌍 Use case: Detection and quantification of the 2020 Lake Victoria flood event through statistical analysis#

❓ Quality assessment question#

  • Can the C3S Lake Water Levels dataset be used to conduct extreme value analyses to detect and quantify flood events?

In 2020, Lake Victoria experienced an extreme flood event with significant regional impacts. Accurate definition and characterization of such events are essential for understanding their causes and informing adaptation strategies. This notebook explores whether the Lake water levels from 1992 to present derived from satellite observations dataset (C3S-LWL v5.0), a freely available and harmonized satellite-derived product, is suitable for defining high-impact flood events through statistical analysis.

Inspired by methodologies such as those applied by Pietroiusti et al. (2024) [1] using the DAHITI dataset, this analysis applies extreme value analysis (EVA) to assess the 2020 flood event. The goal is to evaluate whether C3S-LWL v5.0 can serve as a reliable resource for event definition in impact attribution workflows, particularly in data-scarce regions like the Lake Victoria basin.

📢 Quality assessment statement#

These are the key outcomes of this assessment:

  • The assessment indicates the C3S Lake Water Levels dataset is suitable for detecting and characterising flood events: it identifies the 2020 Lake Victoria flood event and ranks it as the third-highest 180-day lake level rise in the historical record (1948–2023), with only 1962 and 1998 recording greater increases.

  • Differences between DAHITI and C3S-LWL v5.0 are more pronounced in the early record (1992–2002), due to limited satellite coverage and distinct processing approaches, but these differences shrink substantially after 2002. In recent years, the two datasets show very close agreement.

📋 Methodology#

First, the C3S LWL and DAHITI datasets of Lake Victoria’s water levels were compared. Both datasets are based on satellite altimetry but differ in their processing algorithms. We evaluated the two datasets in terms of temporal completeness and analysed the causes of their discrepancies. This comparison was motivated by the fact that DAHITI was used in Pietroiusti et al. (2024) [1] for extreme event attribution (EEA) of the 2020 Lake Victoria flood. Our goal was to assess whether the C3S LWL dataset could serve as a suitable alternative to DAHITI for this type of application. To extend the record, a reconstructed lake level dataset was also built using HYDROMET data, which cover a period prior to satellite-based products. Finally, since lake level data are primarily required during the event definition step of the EEA process, we focused on this first step and tested its implementation using the C3S LWL dataset. The event definition follows the same procedure as Pietroiusti et al. (2024) [1]: the annual block maxima of the chosen variable, the rate of change in water levels over a fixed time window, are calculated and ranked.

The analysis and results are organised in the following steps, which are detailed in the sections below:

1. Data request and download

  • Download C3S–LWL v5.0 satellite-lake-water-level data for Lake Victoria.

2. Comparison of C3S LWL with the DAHITI dataset

  • Load and preprocess DAHITI data.

  • Plot the monthly count of recorded values for both the DAHITI and C3S datasets.

  • Plot C3S and DAHITI’s water levels and show the average difference per period.

3. Time series reconstruction

  • Load and preprocess HYDROMET data.

  • Calculate the average difference between HYDROMET and C3S-LWL overlapping data and correct the HYDROMET dataset based on this.

  • Interpolate to fill missing gaps in HYDROMET.

  • Plot reconstructed time series.

4. Probabilistic extreme event analysis

  • Compute the annual block maxima of the reconstructed time series.

  • Compare with the results from Pietroiusti et al. (2024) [1].

📈 Analysis and results#

1. Data request and download#

Import packages#

Import the packages to download the data using the c3s_eqc_automatic_quality_control library.

Hide code cell source

import glob
import os
import pprint
import warnings
from datetime import datetime

import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import pooch
import scipy
import sklearn
from matplotlib.pyplot import cm
from scipy import stats
from scipy.stats import genextreme

from c3s_eqc_automatic_quality_control import download

warnings.filterwarnings("ignore")
plt.style.use("seaborn-v0_8-notebook")

Set the data request#

Set the request for the specific lake analyzed (Victoria in our case) and the collection id (satellite-lake-water-level).

Hide code cell source

collection_id = "satellite-lake-water-level"
request = {
    "variable": "all",
    "region": "southern_africa",
    "lake": "victoria",
}
varname = "water_surface_height_above_reference_datum"

Download data#

Hide code cell source

da = download.download_and_transform(collection_id, request)[varname].compute()

Hide code cell output

100%|██████████| 1/1 [00:00<00:00, 20.35it/s]

2. Comparison of C3S LWL with the DAHITI dataset#

Load and preprocess DAHITI data#

This assessment applies the methods developed by Pietroiusti et al. (2024) [1], whose data are openly available via Zenodo. In that study, the DAHITI dataset [2] was employed. Accordingly, a comparison of Lake Victoria’s water level time series from C3S-LWL v5.0 and DAHITI will be carried out. The DAHITI time series for Lake Victoria can be accessed and downloaded from the DAHITI portal.

For reproducibility, the specific DAHITI dataset used in this assessment is located in the Zenodo repository at: lakevic-eea-data-zenodo\lakevic-eea-data-zenodo\lakevic-eea-analysis\lakelevels\DAHITI_lakelevels_070322.csv.

Hide code cell source

# Load data
pietroiusti_paper_url = (
    "https://zenodo.org/records/10793917/files/lakevic-eea-data-zenodo.zip?download=1"
)
fnames = pooch.retrieve(
    url=pietroiusti_paper_url,
    known_hash="md5:97c1c88b4e344c7f93d8632b6d6d5e44",
    processor=pooch.Unzip(),
)
(dahiti_file,) = [
    fname for fname in fnames if fname.endswith("DAHITI_lakelevels_070322.csv")
]
dahiti_raw = pd.read_csv(dahiti_file)

Hide code cell source

# Convert date column to datetime and set as index
dahiti_raw['date'] = pd.to_datetime(dahiti_raw['date'])
dahiti_raw = dahiti_raw.set_index('date')
dahiti_raw.index.name = 'datetime'

# Create scaled integer water level
dahiti_raw['water_level'] = (dahiti_raw['water level'] * 1000).round().astype(int)

# Keep original as meters
dahiti_raw['water_level_m'] = dahiti_raw['water level']

# Error as integer
dahiti_raw['error'] = dahiti_raw['error'].astype(int)

# Reorder columns
df_dahiti = dahiti_raw[['water_level','error','water_level_m']]
df_dahiti
            water_level  error  water_level_m
datetime
1992-09-27      1135000      0       1135.000
1992-10-07      1135023      0       1135.023
1992-10-17      1135032      0       1135.032
1992-10-27      1134958      0       1134.958
1992-11-06      1135035      0       1135.035
...                 ...    ...            ...
2022-01-22      1136143      0       1136.143
2022-02-01      1136145      0       1136.145
2022-02-11      1136199      0       1136.199
2022-02-21      1136227      0       1136.227
2022-03-03      1136229      0       1136.229

1058 rows × 3 columns

Monthly temporal completeness#

Hide code cell source

# First dataset: xarray object
monthly_counts_1 = da.resample(time="M").count().to_pandas()

# Second dataset: Pandas DataFrame
monthly_counts_2 = df_dahiti['water_level'].resample("M").count()

# Create subplots with 1 row and 2 columns
fig, axes = plt.subplots(1, 2, figsize=(18, 6), dpi=300, sharey=True)

# Plot first bar chart
monthly_counts_1.plot(kind='bar', color='skyblue', width=0.8, ax=axes[0])
axes[0].set_title("Monthly Count of Available C3S-LWL v5.0 Water Level Values")
axes[0].set_xlabel("Time")
axes[0].set_ylabel("Count of Available Values")
axes[0].set_xticks(range(0, len(monthly_counts_1), 12))
axes[0].set_xticklabels(monthly_counts_1.index[::12].strftime('%Y'), rotation=45)
axes[0].grid(axis='y', linestyle='--', alpha=0.7)

# Plot second bar chart
monthly_counts_2.plot(kind='bar', color='skyblue', width=0.8, ax=axes[1])
axes[1].set_title("Monthly Count of Available DAHITI Water Level Values")
axes[1].set_xlabel("Time")
axes[1].set_xticks(range(0, len(monthly_counts_2), 12))
axes[1].set_xticklabels(monthly_counts_2.index[::12].strftime('%Y'), rotation=45)
axes[1].grid(axis='y', linestyle='--', alpha=0.7)

# Adjust layout
plt.tight_layout()
plt.show()

Figure 1. Comparison of count of monthly values over time available in the C3S-LWL v5.0 dataset (on the left) and in DAHITI (on the right).

The DAHITI dataset exhibits a relatively consistent number of monthly observations, typically ranging from 1 to 4 throughout its entire temporal span. In contrast, the C3S-LWL v5.0 dataset shows variability in the number of monthly values, depending on the availability of satellite missions over time. Up until the end of 2010, only one value per month was recorded. From 2011 to 2015, this number increased to mostly three observations per month, likely reflecting additional unfiltered data from the Jason-2 mission. A significant increase in observation frequency is evident from 2016 onward, corresponding to the introduction of Jason-3 and Sentinel-3A satellites. The C3S-LWL v5.0 dataset contains only one gap, in 2006, indicating strong overall temporal completeness.
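As a rough check of these mission-period differences, the mean monthly counts in the C3S-LWL series computed above can be summarised directly. The period boundaries below are approximations read from Figure 1, not official mission dates.

# Mean number of C3S-LWL values per month in each (approximate) mission era
mission_periods = {
    "1992-2010 (TOPEX/Poseidon era)": ("1992", "2010"),
    "2011-2015 (Jason-2 era)": ("2011", "2015"),
    "2016-2023 (Jason-3 / Sentinel-3A era)": ("2016", "2023"),
}
for label, (start, end) in mission_periods.items():
    print(f"{label}: {monthly_counts_1.loc[start:end].mean():.1f} values/month")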

The differing number of monthly observations between the datasets can be explained by the fact that DAHITI applies interpolation using a Kalman filter, a statistical technique that predicts the water level at the next time step based on previous observations and associated uncertainties [1]. This allows DAHITI to maintain a nearly uniform temporal resolution, achieving approximately three observations per month even during periods with limited satellite overpasses. The C3S-LWL v5.0 dataset, on the other hand, does not interpolate. The observations from the different satellite missions are processed in three sequential steps, each applying thresholds and filtering criteria to remove bad-quality data, so that only high-quality data are retained in the final dataset. Because older satellite sensors generally had lower accuracy, a larger fraction of early observations is filtered out, whereas later missions contribute more usable data. More in-depth information on the processing steps used to produce the C3S-LWL v5.0 dataset can be found in the Algorithm Theoretical Basis Document (ATBD).
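To illustrate the kind of gap-filling DAHITI describes, the following is a minimal sketch of a one-dimensional Kalman filter with a random-walk state model applied to a sparse level series. The function kalman_fill and the noise variances q and r are illustrative assumptions, not DAHITI's implementation or parameters.

# Minimal 1-D Kalman filter sketch (random-walk state model) showing how
# sparse altimetry observations can be propagated to a regular daily grid.
# The noise variances are illustrative assumptions, not DAHITI's values.
def kalman_fill(times, obs, q=1e-4, r=1e-2):
    """times: integer day indices with observations; obs: levels [m];
    q, r: process and observation noise variances (assumed)."""
    n_days = int(times[-1]) + 1
    x, p = obs[0], r                 # initial state estimate and variance
    obs_map = dict(zip(times, obs))
    filled = np.empty(n_days)
    for t in range(n_days):
        p = p + q                    # predict: random walk adds process noise
        if t in obs_map:             # update: blend prediction with observation
            k = p / (p + r)          # Kalman gain
            x = x + k * (obs_map[t] - x)
            p = (1 - k) * p
        filled[t] = x
    return filled

# Example: observations roughly every 30 days propagated to daily estimates
days = np.array([0, 30, 61, 92])
levels = np.array([1135.00, 1135.05, 1135.02, 1135.10])
print(kalman_fill(days, levels)[:5])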

Plot C3S and DAHITI’s Water Level#

Hide code cell source

# Extract time series from xarray (da)
da_df = da.to_dataframe(name='water_level')
# Plotting
plt.figure(figsize=(12, 6), dpi=300)
df_dahiti['water_level_m'].plot(label='DAHITI', alpha=0.7)
da_df['water_level'].plot(label='C3S-LWL v5.0', alpha=0.7)

# Final touches
plt.title("Lake Victoria Water Level Comparison Between Products")
plt.grid(True)
plt.xlabel("Date")
plt.ylabel("Water Level [m]")
plt.legend()
plt.tight_layout()
plt.show()

Figure 2. Lake Victoria’s water level from 1992 to present from C3S-LWL v5.0 and DAHITI.

Hide code cell source

# Extract time series from xarray (da)
df_c3s = da.to_dataframe(name='water_level_c3s')
df_c3s = df_c3s.dropna()
df_c3s.index.name = 'datetime'

# Align by nearest timestamp
comparison_df = pd.merge_asof(
    df_c3s,
    df_dahiti.reset_index()[['datetime', 'water_level_m']].rename(columns={'water_level_m': 'water_level_dahiti'}),
    on='datetime',
    direction='nearest',
    tolerance=pd.Timedelta("3D")
)

# Drop rows where no match was found
comparison_df = comparison_df.dropna()

# Compute difference
comparison_df['diff'] = comparison_df['water_level_c3s'] - comparison_df['water_level_dahiti']

# Ensure the datetime column is in datetime format
comparison_df['datetime'] = pd.to_datetime(comparison_df['datetime'])

# Define periods
periods = {
    '1992-2002 (TOPEX/Poseidon)': ('1992-01', '2001-12'),
    '2002-2016': ('2002-01', '2015-12'),
    '2016-2023': ('2016-01', '2023-12')
}

# Calculate average differences using the datetime column
for label, (start, end) in periods.items():
    mask = (comparison_df['datetime'] >= start) & (comparison_df['datetime'] <= end)
    avg_diff = comparison_df.loc[mask, 'diff'].mean()
    print(f"Average difference for {label}: {avg_diff:.3f} m")
Average difference for 1992-2002 (TOPEX/Poseidon): 0.334 m
Average difference for 2002-2016: 0.043 m
Average difference for 2016-2023: 0.002 m

The difference in water level estimates for Lake Victoria between the DAHITI and C3S-LWL v5.0 datasets varies noticeably over time (see Figure 2). A larger discrepancy is observed during the first decade of the time series (1992-2002), when the only satellite data available originated from the TOPEX/Poseidon mission. During this period, the average difference between the two datasets is 33.4 cm, with C3S values consistently higher than those from DAHITI.

This early discrepancy appears to stem from differences in the algorithms and retracking methods used by the two datasets. DAHITI, in some cases, applies an improved 10% threshold retracker for better performance on inland water bodies. However, whether the 10% retracker was applied to Lake Victoria could not be confirmed, since the available documentation only details its application for lakes in the Americas. Additionally, DAHITI employs a Kalman filter to interpolate and smooth its time series, further influencing the results [2]. In contrast, according to the ATBD, the C3S-LWL v5.0 dataset uses ocean retracking for large lakes such as Lake Victoria.

However, this difference diminishes noticeably after 2002 (to less than 10 cm on average), as newer satellite missions introduced more frequent and accurate measurements. The influence of processing appears much smaller in later years, likely because improvements in sensor technology and data availability reduce the impact of such methodological differences.

These findings are supported by the PQAR, which includes Lake Victoria under the Southern Africa region. According to the PQAR, the Pearson correlation coefficient between DAHITI and C3S in this region is very close to 1, and the Unbiased Root Mean Square Error (URMSE) remains below 25%, indicating strong overall agreement in this region.
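Those two metrics can be approximated directly from the aligned series built above. Note that URMSE is computed here as the RMSE after removing the mean bias, which may differ slightly from the exact PQAR definition.

# Pearson correlation and unbiased RMSE between the aligned time series.
# URMSE here = RMSE after removing the mean bias (assumed definition).
c3s_vals = comparison_df['water_level_c3s']
dahiti_vals = comparison_df['water_level_dahiti']

pearson_r, _ = stats.pearsonr(c3s_vals, dahiti_vals)
bias = (c3s_vals - dahiti_vals).mean()
urmse = np.sqrt(((c3s_vals - dahiti_vals - bias) ** 2).mean())

print(f"Pearson r: {pearson_r:.4f}")
print(f"Mean bias (C3S - DAHITI): {bias:.3f} m")
print(f"URMSE: {urmse:.3f} m")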

3. Time series reconstruction#

The temporal coverage of the C3S-LWL dataset varies by lake, with a target record length of more than 25 years and a minimum threshold of 10 years. To increase statistical confidence and improve robustness, other available datasets can be merged with C3S-LWL to form a single continuous record.

The code in this section is based on the work of Pietroiusti et al. (2024) [1].

Load and preprocess HYDROMET data#

From January 1, 1948 to August 1, 1996, daily in situ water level measurements at the Jinja station were obtained from the WMO Hydrometeorological Survey (hereafter referred to as HYDROMET) [3]. These measurements, originally recorded as water depth at the lake’s outflow, were later converted to meters above sea level by adding a geoid correction of 1122.887 m, following the approach of Vanderkelen et al. (2018) [4].
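As a minimal illustration of that conversion, the constant comes from the reference above and the example reading matches the first row of the table shown further below.

# Convert a Jinja gauge reading (water depth, m) to metres above sea level
GEOID_CORRECTION = 1122.887          # m, Vanderkelen et al. (2018) [4]
gauge_depth = 11.210                 # example reading on 1948-01-01 [m]
print(f"{gauge_depth + GEOID_CORRECTION:.3f} m a.s.l.")  # -> 1134.097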

The HYDROMET dataset is located in the Zenodo repository at: lakevic-eea-data-zenodo\lakevic-eea-data-zenodo\lakevic-eea-analysis\lakelevels\Jinja_lakelevels_Van.txt.

Hide code cell source

# Load data
(hydromet_file,) = [
    fname for fname in fnames if fname.endswith("Jinja_lakelevels_Van.txt")
]
HYDROMET_raw = pd.read_csv(
    hydromet_file,
    sep="\t",
    header=None,
)

# Pre-process data
HYDROMET_raw.columns = ["year", "month", "day", "water level", "meas"]
HYDROMET_dates = pd.to_datetime(HYDROMET_raw[["year", "month", "day"]])
df = pd.DataFrame(HYDROMET_dates, columns=["date"])
df[["water_level", "meas"]] = HYDROMET_raw[["water level", "meas"]]
HYDROMET = df.set_index(["date"])
# Replace HYDROMET zeros with NaN, so that Matplotlib won't connect the points
HYDROMET["water_level"] = HYDROMET["water_level"].replace(0, np.nan)
HYDROMET["meas"] = HYDROMET["meas"].replace(0, np.nan)
HYDROMET
            water_level    meas
date
1948-01-01     1134.097  11.210
1948-01-02     1134.102  11.215
1948-01-03     1134.062  11.175
1948-01-04     1134.052  11.165
1948-01-05     1134.077  11.190
...                 ...     ...
1996-07-28     1134.777  11.890
1996-07-29     1134.757  11.870
1996-07-30     1134.752  11.865
1996-07-31     1134.717  11.830
1996-08-01     1134.747  11.860

17746 rows × 2 columns

Merging the datasets together#

Hide code cell source

C3S = da_df
fig, ax = plt.subplots(figsize=(12, 6),dpi=250)
C3S['water_level'].plot(ax=ax, label="C3S")
HYDROMET['water_level'].plot(ax=ax, label="HYDROMET")
ax.grid(True)
ax.legend()
plt.xlabel("Date")
plt.ylabel("Water level [m]")
fig.tight_layout()
plt.show()

Figure 3. HYDROMET (1948-1996) and C3S (1992-2023) datasets timeseries before correction.

To obtain the reconstructed data series of Lake Victoria’s water levels, it is necessary to correct the HYDROMET data based on the overlapping period (1992-1996).

Hide code cell source

# Merge the two time-series, key on date, keep all observations (outer)
d_HYDROMET = HYDROMET.drop(['meas'], axis=1)
d_C3S = C3S.rename_axis('date')
d_C3S.index = pd.to_datetime(d_C3S.index).normalize()
df_merge = pd.merge(d_HYDROMET, d_C3S, how='outer', on='date')
df_merge.columns = ['HYDROMET', 'C3S']
df_merge.index = df_merge.index.normalize()

# Get only the overlapping observations
df_overlap = df_merge.query('HYDROMET == HYDROMET & C3S == C3S')
df_overlap.head() # 137 rows [1992-09-28 : 1996-07-18] 
            HYDROMET      C3S
date
1992-09-28  1134.567  1135.32
1992-10-18  1134.572  1135.30
1992-11-15  1134.597  1135.36
1992-12-16  1134.672  1135.43
1993-01-14  1134.767  1135.54

Hide code cell source

avg_diff = (df_overlap['C3S'] - df_overlap['HYDROMET']).mean() # C3S overestimates vs. HYDROMET
n_diff = len(df_overlap)
std_diff = (df_overlap['C3S'] - df_overlap['HYDROMET']).std()
print('Average Difference (C3S - HYDROMET) =', round(avg_diff,3)*100, ' ± ', round(std_diff,3)*100, ' cm')
Average Difference (C3S - HYDROMET) = 77.8  ±  3.0  cm

Hide code cell source

# Snap the two timeseries together, add avg diff to HYDROMET 
HYDROMET_corr = HYDROMET.copy()
HYDROMET_corr['water_level'] = HYDROMET['water_level'] + round(avg_diff,2)

Hide code cell source

fig, axes = plt.subplots(1, 2, figsize=(16, 6), dpi=250)

# Left Plot: Full time series
HYDROMET_corr['water_level'].plot(ax=axes[0], label="HYDROMET")
C3S['water_level'].plot(ax=axes[0], label="C3S")

min_date = min(HYDROMET_corr.index.min(), C3S.index.min())
max_date = max(HYDROMET_corr.index.max(), C3S.index.max())
axes[0].set_xlim([min_date, max_date])
axes[0].set_title("Full Time Series")
axes[0].set_ylabel("Water level [m]")
axes[0].set_xlabel("Date")
axes[0].legend()
axes[0].grid(True)

# Right Plot: Zoomed section
C3S['water_level'].plot(ax=axes[1], label="C3S")
HYDROMET_corr['water_level'].plot(ax=axes[1], label="HYDROMET")

axes[1].set_xlim(['1992-01-01', '1997-01-10'])
axes[1].set_title("Zoomed View: 1992–1997")
axes[1].set_xlabel("Date")
axes[1].legend()
axes[1].grid(True)
axes[1].text(0.5, 0.8, "C3S 2023 data \nAvg diff with HYDROMET is 77.8 cm ± 3 cm",
             transform=axes[1].transAxes,
             ha='center', va='center')

# Layout
fig.tight_layout()
plt.show()

Figure 4. HYDROMET (1948-1996) and C3S (1992-2023) datasets timeseries after correction.

The average difference between the two datasets is 77.8 ± 3.0 cm [1]. The HYDROMET dataset was corrected using this value.

Hide code cell source

#Correct for difference
startHYDROMET = np.datetime64('1948-01-01')
endHYDROMET = np.datetime64('1996-08-01')

startC3S = np.datetime64('1992-09-28')
endC3S = np.datetime64('2023-12-27') 

# Overwriting time-series with C3S where available
HYDROMET_keep = HYDROMET_corr.loc[ startHYDROMET : (startC3S - pd.Timedelta("1 day")) ]
HYDROMET_keep = HYDROMET_keep.drop(columns=['meas'])

Hide code cell source

#To have only one value per day 
C3S_keep = C3S.rename_axis('date')
C3S_keep = C3S_keep.resample('D').mean().dropna()
# Make only one list with HYDROMET+C3S data
df_list = [HYDROMET_keep, C3S_keep]
lakelevels_all_raw = pd.concat(df_list, ignore_index=False)
pd.set_option("display.precision", 8)

Hide code cell source

# Round to 2 decimal places
lakelevels_all = lakelevels_all_raw.round(2)
pd.set_option("display.precision", 8)

Hide code cell source

# Calculate basic statistics for the 'water_level' column
meanL = lakelevels_all['water_level'].mean()  # Mean water level
maxL = lakelevels_all['water_level'].max()    # Maximum water level
minL = lakelevels_all['water_level'].min()    # Minimum water level

# Calculate 10% trimmed mean (removes lowest and highest 10% of values)
mtrim = stats.trim_mean(lakelevels_all[['water_level']], 0.1)
mtrim = mtrim.flatten().tolist()[0]  # Convert from array to scalar

# Calculate percentiles
p10 = scipy.stats.scoreatpercentile(lakelevels_all[['water_level']], 10)  # 10th percentile
p90 = scipy.stats.scoreatpercentile(lakelevels_all[['water_level']], 90)  # 90th percentile

Hide code cell source

# Interpolate to daily resolution and round to 2 decimal places
lakelevels_intr = lakelevels_all.resample('D').asfreq().interpolate(method='linear').round(2)

Hide code cell source

#Plot the complete water levels timeseries with the statistics calculated
left = '1948-01-01'
right = '2023-12-27'

fig, ax = plt.subplots(figsize=(12, 6), dpi=250)
lakelevels_intr['water_level'].plot(ax=ax, label="HYDROMET and C3S merged")
ax.grid(True)
ax.legend()
plt.xlabel("Date")
plt.ylabel("Water level [m]")
ax.set_xlim(left, right)
plot_text = ('mean: {:.2f}, max {:.2f}, min {:.2f}, range {:.2f}, \n \n mtrim {:.2f}, p10 {:.2f}, p90 {:.2f} [m] \n \n start date: {}, end date: {} \n \n HM 1948-1992, C3S 1992-2023'
             .format(meanL, maxL, minL, maxL-minL, mtrim, p10,p90, left, right))  #, start date {:.3f}, end date {:.3f} 
ax.text(0.5, 0.2, (plot_text),
        horizontalalignment='center',
        verticalalignment='center',
        transform=ax.transAxes, 
        bbox={'facecolor':'white', 'alpha':0.5, 'pad':10})
ax.xaxis.set_major_locator(mdates.YearLocator(5))
ax.xaxis.set_major_formatter(mdates.DateFormatter('%Y'))
fig.autofmt_xdate()
fig.tight_layout()
plt.show()

Figure 5. Reconstructed Lake Victoria’s water levels timeseries (1948-2023).

A single dataset combining both sources was created. For the overlapping period (1992–1996), C3S-LWL v5.0 data were used when available. The remaining gaps were filled by linear interpolation to obtain a daily resolution.
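As a quick sanity check on the reconstruction, using the lakelevels_intr series built above, one can verify that the merged record is continuous at daily resolution over 1948–2023:

# Verify the reconstructed series is daily and gap-free
print("Range:", lakelevels_intr.index.min().date(), "to", lakelevels_intr.index.max().date())
print("Missing values:", lakelevels_intr['water_level'].isna().sum())
print("Daily steps:", (lakelevels_intr.index.to_series().diff().dropna() == pd.Timedelta('1D')).all())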

4. Probabilistic extreme event analysis#

Pietroiusti et al. (2024) [1] assessed the influence of anthropogenic climate change on the 2020 Lake Victoria floods using the extreme event attribution framework outlined by Philip et al. (2020) [5] and van Oldenborgh et al. (2021) [6]. The methodology follows a structured sequence of steps: (i) defining the event, (ii) estimating probabilities and trends based on observational data, (iii) validating climate models, (iv) conducting attribution using multiple models and methods, and (v) synthesizing the findings into clear attribution statements.

In this assessment, only the first step, event definition, is performed, using reconstructed Lake Victoria water level time series from the HYDROMET dataset and the C3S LWL-v5.0 product. The aim is to assess whether the C3S LWL-v5.0 dataset is suitable for use in attribution analyses of this kind. While the remaining steps involve model simulations and additional data sources that are beyond the scope of this assessment, the assumption is that if the observational component is consistent, the full attribution framework could, in principle, be applied using this dataset as well.

Event definition#

The variable chosen for analysis is the rate of change in water levels (ΔL) over a time window (Δt), with Δt set to 180 days, since most of the level rise in the 2020 Lake Victoria flooding occurred during the six-month period between November 2019 and May 2020 [1]. The code computes ΔL for every possible 180-day interval in the dataset, groups these values by year, and identifies the single largest 180-day change for each year; this is the annual block maximum. The annual block maxima are then ranked to compare the magnitude of events across years. Finally, the ranking and maximum ΔL/Δt for 2020 are reported.
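In formula form, writing \(L(t)\) for the reconstructed daily water level, the event variable and its annual block maximum are:

\[
\Delta L(t) = L(t) - L(t - \Delta t), \qquad
\left(\frac{\Delta L}{\Delta t}\right)_{\text{max},\, y} = \max_{t \,\in\, \text{year}\ y} \Delta L(t), \qquad \Delta t = 180 \ \text{days}
\]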

Hide code cell source

# Set the time window (dt)
dt = 180  # days
lakelevels_intr[f'dLdt_{dt}'] = lakelevels_intr['water_level'] - lakelevels_intr['water_level'].shift(dt)

# Extract annual block maxima
df_block = lakelevels_intr[[f'dLdt_{dt}']].copy()
df_block['year'] = df_block.index.year

# Keep only years with full data: drop the first year (no ΔL for its first
# 180 days) and the last year if it is incomplete
all_years = df_block['year'].unique()
if df_block[df_block['year'] == all_years[-1]].shape[0] < 365:
    valid_years = all_years[1:-1]
else:
    valid_years = all_years[1:]

records = []
for year in valid_years:
    year_data = df_block[df_block['year'] == year]
    max_val = year_data[f'dLdt_{dt}'].max()
    max_day = year_data[year_data[f'dLdt_{dt}'] == max_val].index[0]
    records.append((year, max_val, max_day))

df_max = pd.DataFrame(records, columns=['year', f'dLdt_{dt}', 'date']).set_index('year')

# Calculate rank of each year
df_max['rank'] = df_max[f'dLdt_{dt}'].rank(ascending=False).astype(int)

# Print 2020 event details
if 2020 in df_max.index:
    val = df_max.loc[2020, f'dLdt_{dt}']
    rank = df_max.loc[2020, 'rank']
    print(f"🔹 2020 ΔL/Δt (180 days): {val:.3f} m")
    print(f"🔹 2020 Rank: {rank} out of {len(df_max)} years")
else:
    print("2020 is not in the index. Check your data coverage.")
🔹 2020 ΔL/Δt (180 days): 1.210 m
🔹 2020 Rank: 3 out of 74 years
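This assessment stops at the event definition, but the genextreme import above hints at the next step of the framework: fitting a generalized extreme value (GEV) distribution to the annual block maxima (Coles, 2001 [8]). The following is an illustrative sketch of that step under a stationary-climate assumption; it is not a reproduction of the full, nonstationary analysis of Pietroiusti et al. (2024) [1].

# Illustrative sketch only: fit a stationary GEV distribution to the annual
# block maxima and estimate the return period of the 2020 event. This goes
# beyond the event-definition step assessed in this notebook.
block_maxima = df_max[f'dLdt_{dt}'].dropna().values
shape, loc, scale = genextreme.fit(block_maxima)

event_2020 = df_max.loc[2020, f'dLdt_{dt}']
# Exceedance probability under the fitted GEV; the return period is its inverse
p_exceed = genextreme.sf(event_2020, shape, loc=loc, scale=scale)
print(f"GEV fit: shape={shape:.3f}, loc={loc:.3f} m, scale={scale:.3f} m")
print(f"Estimated return period of the 2020 rise: {1 / p_exceed:.0f} years")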

Hide code cell source

# Ensure rolling mean exists
df_max['rolling_mean'] = df_max[f'dLdt_{dt}'].rolling(window=10, center=True, min_periods=1).mean()

# Get top 3 values
top3 = df_max.sort_values(by=f'dLdt_{dt}', ascending=False).head(3)

# Plot
plt.figure(figsize=(10, 6), dpi=250)

# Step plot for annual maxima (red)
plt.step(df_max.index, df_max[f'dLdt_{dt}'], where='mid', color='red', linewidth=1,label=f'Annual max (Δt = {dt} days)')

# Rolling mean (green)
plt.plot(df_max.index, df_max['rolling_mean'], color='lime', linewidth=1, label='10-year rolling mean')

# Top 3 events (black markers + labels)
for idx, row in top3.iterrows():
    plt.scatter(idx, row[f'dLdt_{dt}'], color='black', zorder=5)
    plt.text(idx, row[f'dLdt_{dt}'] + 0.02, f"{row['date'].year}", ha='center', fontsize=9, color='black')

# Labels, title, grid
plt.title(f'Annual Block Maxima of ΔL/Δt (Δt = {dt} days), 1949–2022')
plt.xlabel('Year')
plt.ylabel(f'ΔL/Δt max (m)')
plt.grid(True)
plt.legend()
plt.tight_layout()
plt.show()

Figure 6. Annual block maxima time series \((\Delta L / \Delta t)_{\text{max}}\) with Δt = 180 d for the period 1949–2022 and 10-year rolling mean of the time series (using the reconstructed HYDROMET and C3S-LWL v5.0 record).


Figure 7. Annual block maxima time series \((\Delta L / \Delta t)_{\text{max}}\) with Δt=180 d for the period 1897–2021 and 10-year rolling mean of the time series (using DAHITI data). Source: Pietroiusti et al. (2022) [7]

The results of this analysis are consistent with those reported by Pietroiusti et al. (2024) [1]. According to the C3S-LWL v5.0 dataset, the year 2020 recorded the third-largest 180-day lake level rise, surpassed only by 1962 and 1998. The magnitude of the 2020 rise was 1.21 meters, which matches the value obtained by Pietroiusti et al. using the DAHITI dataset.

However, some discrepancies emerge when comparing other years. For instance, in 2000, the C3S-LWL v5.0 data shows a negative value, indicating a consistent decrease in lake levels throughout the year (see Figure 6). Conversely, the corresponding figure in Pietroiusti et al. (2022) [7] (see Figure 7) shows a slightly positive value.

These differences likely stem from variations in the underlying datasets, as discussed in 2. Comparison of C3S LWL with the DAHITI dataset. In 2000, TOPEX/Poseidon was still the only mission providing data for the C3S-LWL, with only one value per month. Meanwhile, DAHITI applied Kalman-filtered interpolation, resulting in a denser time series. Because no reliable in situ data are available for this period, it is difficult to determine which dataset more accurately reflects reality.

Nevertheless, the agreement between both datasets on the 2020 block maximum strengthens confidence in the reliability of this result. Although C3S-LWL and DAHITI use different processing algorithms, the increased availability of satellite observations in recent years (particularly after 2016) has resulted in greater similarity between their outputs. With more frequent and higher-quality measurements from multiple missions, the differences between the datasets diminish, making the 1.21 m rise observed in 2020 consistent across sources and correspondingly more trustworthy.

This assessment supports the suitability of the C3S-LWL v5.0 dataset for extreme event attribution applications, as the event-definition results closely match those from Pietroiusti et al. (2024) [1]. Since this step feeds directly into the rest of the workflow, similar outcomes would be expected in a full attribution analysis.

ℹ️ If you want to know more#

Key resources#

Code libraries used:

Dataset documentation:

References#

[1] Pietroiusti, R., Vanderkelen, I., Otto, F. E. L., Barnes, C., Temple, L., Akurut, M., Bally, P., van Lipzig, N. P. M., and Thiery, W. (2024). Possible role of anthropogenic climate change in the record-breaking 2020 Lake Victoria levels and floods, Earth Syst. Dynam., 15, 225–264.

[2] Schwatke, C., Dettmering, D., Bosch, W., and Seitz, F. (2015). DAHITI - an innovative approach for estimating water level time series over inland waters using multi-mission satellite altimetry, Hydrol. Earth Syst. Sci., 19, 4345–4364.

[3] WMO-UNPD (1974). Hydrometeorological Survey of the Catchments of Lakes Victoria, Kyoga and Albert: Vol 1 Meteorology and Hydrology of the Basin.

[4] Vanderkelen, I., Van Lipzig, N. P., and Thiery, W. (2018). Modelling the water balance of Lake Victoria (East Africa)-Part 1: Observational analysis. Hydrology and Earth System Sciences, 22(10):5509–5525.

[5] Philip, S., Kew, S., van Oldenborgh, G. J., Otto, F., Vautard, R., van der Wiel, K., King, A., Lott, F., Arrighi, J., Singh, R., and van Aalst, M. (2020). A protocol for probabilistic extreme event attribution analyses, Adv. Stat. Clim. Meteorol. Oceanogr., 6, 177–203.

[6] van Oldenborgh, G. J., van der Wiel, K., Kew, S., Philip, S., Otto, F., Vautard, R., King, A., Lott, F., Arrighi, J., Singh, R., and van Aalst, M. (2021). Pathways and pitfalls in extreme event attribution. Climatic Change, 166, 13.

[7] Pietroiusti, R., Vanderkelen, I., van Lipzig, N. P. M., and Thiery, W. (2022). Was the 2020 Lake Victoria flooding ‘caused’ by anthropogenic climate change? An event attribution study. M.Sc. thesis, Dept. of Hydrology and Climate, Vrije Universiteit Brussel and KU Leuven.

[8] Coles, S. (2001). An Introduction to Statistical Modeling of Extreme Values. Springer Series in Statistics. Springer-Verlag London. ISBN: 978-1-85233-459-8.