Uncertainty in drought indicators for national monitoring

logo

Please note that this repository is used for development and review, so quality assessments should be considered work in progress until they are merged into the main branch

6.1. Uncertainty in drought indicators for national monitoring#

Production date: 2026-02-26.

Please note that this repository is used for development and review, so quality assessments should be considered work in progress until they are merged into the main branch.

Dataset version: 1.0.

Produced by: Enis Gerxhalija, Olivier Burggraaff (National Physical Laboratory).

🌍 Use case: Monitoring drought on a national scale using reanalysis-based drought indicators#

❓ Quality assessment question#

What is the ensemble spread for drought indicators in the ERA5–Drought dataset and how does this propagate into drought monitoring levels?
Can the ensemble in ERA5–Drought provide additional information for operational drought monitoring, compared to only using the deterministic reanalysis?

Drought and wet periods have far-reaching environmental, societal, and economic impacts. In the United Kingdom, the record-breaking hot and dry spring and summer of 2025 caused harvest losses worth more than £800 million [ECIU+25]. An 18-month drought in Brazil in 2023–24, the most severe since monitoring began in 1954, led to 720 health centres in affected areas becoming non-operational [UNICEF+24]. Extreme rainfall killed hundreds of people and caused billions of € in damages in Spain in 2024 [Franch-Pardo+25]. Climate change is thought to be the primary driver behind the increase in drought and wet periods since the 1950s [IPCC+23], and this trend is expected to continue into the future.

Given these impacts, monitoring drought and wet periods is vital. Several quantitative proxies have been developed to aid in this monitoring, generally based on a combination of total precipitation, soil moisture, evapotranspiration, and surface (air) temperature. Two widely-employed indices are the Standardised Precipitation Index (SPI) [McKee+93] and Standardised Precipitation-Evapotranspiration Index (SPEI) [Vicente-Serrano+10]. Both operate on the same principle, namely quantifying the amount of precipitation over a given time frame at a given location relative to its historical climatology. For example, an SPI value of –1 corresponds to a precipitation that is 1 standard deviation below the mean, for that site and time frame. This probabilistic approach lends itself to statements on the occurrence rate of extreme events [McKee+93]. SPEI expands on SPI by including not just gains from precipitation, but also losses due to evapotranspiration. SPI and SPEI can be evaluated over different accumulation periods to probe phenomena at different time scales [Keune+25, EDO+25]:

1, 3 months: soil moisture, flow in smaller creeks.
6, 12 months: reservoir storage, stream flow.
24, 36, 48 months: reservoir and groundwater recharge.

ECMWF now provide SPI and SPEI indices derived from their fifth-generation reanalysis ERA5, which assimilates meteorological data and models on a global ~31 km grid going back to 1940 [Soci+24, Hersbach+20], in the ERA5–Drought dataset [Keune+25], available from the Climate Data Store (CDS). ERA5 is a well-established dataset, widely used across many sectors including drought monitoring and forecasting [Evenflow+24]; quality assessments for ERA5 itself can be found in Reanalysis. The derived dataset ERA5–Drought can be a valuable resource for applications in many sectors, since it can be used out of the box, freeing users from the need to find and process the underpinning meteorological data themselves. ERA5–Drought provides monthly SPI and SPEI for 7 different accumulation periods, interpolated to a 0.25° × 0.25° grid.

A key strength of ERA5 and ERA5–Drought is the inclusion of an ensemble that can be used to estimate (part of) the uncertainty in the data. Its 10 members are generated by running the ERA5 data assimilation and forecast model at a coarser resolution, 9 of them with perturbed inputs representing the input data uncertainty, and one control member with the same inputs as the higher-resolution deterministic reanalysis to provide a representative comparison. ECMWF’s ensemble system EDA is described in detail in [Isaksen+10]; its implementation for ERA5 in [Hersbach+20]. The latter writes that “[t]he ERA5 EDA spread among the ten ensemble members can be interpreted as a measure for the uncertainty in the [higher-resolution reanalysis] estimates” and “should mainly be used as a guide for the quality of representing the correct synoptic situation at a given time, rather than for long-term and/or large-scale averages.” Consequently, while the ensemble spread should not be considered an uncertainty in the metrological sense, it can provide useful information on part of the uncertainty in the data. In ERA5–Drought, SPI and SPEI introduce additional complexity by accumulating over time, regridding in space, and using a 30-year climatology reference window. For a metrological guide to uncertainty propagation with ensemble/Monte Carlo methods, we refer the reader to [JCGM+08].

This quality assessment examines the uncertainty in SPI and SPEI values from the ERA5–Drought (Monthly drought indices from 1940 to present derived from ERA5 reanalysis) dataset with a focus on national drought monitoring. We investigate the ensemble spread in SPI and SPEI across one country (Australia) as well as the agreement between the deterministic reanalysis and individual ensemble members in terms of declaring drought states (such as “extremely dry”). In doing so, we assess whether the ensemble spread is sufficiently small for these data to underpin a drought monitoring programme, and how the ensemble can provide additional information for such programmes.

📢 Quality assessment statement#

These are the key outcomes of this assessment

The ensemble in ERA5–Drought provides useful information on the spread and thus uncertainty in SPI/SPEI drought indicators. The ensemble spread should be treated as a first-order estimate for the uncertainty, not a metrological uncertainty budget.
The ensemble spread in drought indicators is generally small (≤0.24 for SPI and ≤0.15 for SPEI in this case study) for pixels that comply with the included quality flags.
In most cases, the majority of ensemble members agree with the deterministic reanalysis in declaring a drought level. Therefore, ERA5–Drought can be used confidently for monitoring drought. The ensemble can be used to measure said confidence.
Disagreements between ensemble members and the deterministic reanalysis are the result of propagated uncertainty from ERA5 combined with methodological factors, most prominently the difference in spatial resolution between ensemble and deterministic reanalysis.
The ensemble provides new opportunities for more nuanced monitoring of drought, compared to a binary classification based on the deterministic reanalysis alone. For example, triggers could be set based on the number of ensemble members that pass an SPI or SPEI threshold to capture a wider and smoother area of potential drought and provide early warnings.

📋 Methodology#

This quality assessment examines the uncertainty in SPI and SPEI values from the ERA5–Drought (Monthly drought indices from 1940 to present derived from ERA5 reanalysis) dataset with a focus on national drought monitoring.

This notebook provides and runs through Python code for downloading the deterministic reanalysis and ensemble for SPI and SPEI from ERA5–Drought with the included quality flags, analysing the spread in the ensemble and its agreement with the deterministic reanalysis, and assigning drought warning levels (e.g. “severe drought”) across a country (here Australia) during a known drought event. This calculation is performed individually for each ensemble member, thus propagating the distribution of meteorological data (ERA5) through SPI/SPEI (ERA5–Drought) into the derived product (drought level). As noted above, this is not a full metrological uncertainty analysis and should be considered a first-order estimate.

This notebook is set up so that the test country and dates are easily customised, so you can download it and apply it to your preferred domain.

1. Code setup

Import all required libraries.
Define helper functions.

2. Download data

Download SPI and SPEI from ERA5–Drought.
Download quality flags from ERA5–Drought.
Pre-process data.

3. Drought indicators

Analyse SPI and SPEI geospatially with and without quality flags.

4. Drought classification

Assign drought levels to the deterministic reanalysis and individual ensemble members.
Assess the uncertainty in drought levels by comparing the reanalysis and ensemble.

📈 Analysis and results#

1. Code setup#

Note

This notebook uses earthkit for downloading (earthkit-data) and visualising (earthkit-plots) data. Because earthkit is in active development, some functionality may change after this notebook is published. If any part of the code stops functioning, please raise an issue on our GitHub repository so it can be fixed.

Import required libraries#

In this section, we import all the relevant packages needed for running the notebook.

Helper functions#

This section defines some functions and variables used in the following analysis, allowing code cells in later sections to be shorter and ensuring consistency.

Data (pre-)processing#

The following functions handle downloading data in specific circumstances, e.g. a geographical or temporal subset:

The following functions aid in processing quality flag data from ERA5–Drought into a standard format:

Categorising SPI and SPEI#

The following cell defines categories for SPI and SPEI values, e.g. “severe drought”.

Quality flags#

The following functions apply the quality flags included in ERA5–Drought, such as the probability of zero precipitation and the Shapiro–Wilk normality test:

Statistics#

The following functions perform statistics on the ensemble datasets, both calculating the mean and standard deviation in ensemble members, and the drought severity category they fall in.

The following functions count the number of (mis)matches between the deterministic reanalysis and invididual ensemble members and tabulate the result:

Show code cell source

Hide code cell source

# Helper function
def count_how_often_met(counts: xr.Dataset, thresholds: Iterable[int]=range(1, 11)) -> pd.DataFrame:
    """
    Given a masked dataset of
    "how often does the ensemble meet a threshold if the reanalysis meets/doesn't meet the threshold"
    tabulate how often this is 0, >= 1, ...
    """
    # Columns in table: How often do n==0, >=1, ..., >=10 members meet the threshold?
    count_0 = {0: (counts == 0).sum()}  # 0 has to be done separately because it's ==, not >=
    count_n = {n: (counts >= n).sum() for n in thresholds}
    count_n = count_0 | count_n  # Combine into one dict

    # Normalise by total number of pixels under consideration (in %)
    count_n = {n: count / counts.count() * 100 for n, count in count_n.items()}

    # Combine into DataFrame
    count_n = {n: count.to_pandas() for n, count in count_n.items()}
    count_n = pd.DataFrame(count_n)

    # Nicer column labels
    column_names = {0: "n = 0"} | {n: f"n ≥ {n}" for n in thresholds}
    count_n = count_n.rename(columns=column_names)

    return count_n

# Helper function
def fraction_match_mismatch(data_reanalysis: xr.Dataset, data_ensemble: xr.Dataset, threshold: float, *,
                            category_label: str="in <drought category>",
                            ensemble_dimension: str=ENSEMBLE_DIMENSION, **kwargs) -> pd.DataFrame:
    """
    Given a reanalysis and ensemble dataset, count how many ensemble members meet the `threshold`
    (e.g. SPI is below the upper limit for "severe drought")
    for pixels where the reanalysis does / doesn't meet the same threshold.
    Applies to one category (threshold + label).
    """
    # Find pixels where the reanalysis meets the threshold (e.g. is at least moderate drought) and doesn't (e.g. is not moderate drought)
    reanalysis_meets_threshold       = (data_reanalysis <  threshold)
    reanalysis_doesnt_meet_threshold = (data_reanalysis >= threshold)  # Don't invert the mask because of NaNs

    # Count the number of ensemble members that meet the threshold (e.g. are at least moderate drought)
    ensemble_meets_threshold = (data_ensemble < threshold)
    ensemble_counts = ensemble_meets_threshold.sum(dim=ensemble_dimension)  # Number of members in category
    ensemble_counts = ensemble_counts.where(data_ensemble.sel({ensemble_dimension: 1}).notnull())  # Re-apply NaN mask

    # Match-up: Count the number of ensemble members that meet the threshold in the same pixel where the reanalysis meets/doesn't meet it
    # Basically match and mismatch, but with number of ensemble members instead of True/False
    ensemble_counts_where_reanalysis_meets_threshold       = ensemble_counts.where(reanalysis_meets_threshold)
    ensemble_counts_where_reanalysis_doesnt_meet_threshold = ensemble_counts.where(reanalysis_doesnt_meet_threshold)

    # Columns in table: How often do n==0, >=1, ..., >=10 members meet the threshold?
    meet_counts        = count_how_often_met(ensemble_counts_where_reanalysis_meets_threshold)
    doesnt_meet_counts = count_how_often_met(ensemble_counts_where_reanalysis_doesnt_meet_threshold)

    return meet_counts, doesnt_meet_counts

# Helper function
def fraction_match_mismatch_with_formatting(data_reanalysis: xr.Dataset, data_ensemble: xr.Dataset,
                                        indicator: str, categories: Iterable[Iterable], *,
                                        ensemble_dimension: str=ENSEMBLE_DIMENSION) -> tuple[pd.DataFrame]:
    # Apply fraction_match_mismatch to this indicator for each drought level
    match, mismatch = zip(*[fraction_match_mismatch(data_reanalysis[[indicator]], data_ensemble[[indicator]],
                                                    upper_lim, category_label=category_label,
                                                    ensemble_dimension=ensemble_dimension)
                            for (lower_lim, upper_lim, category_label, category_colour) in categories])

    # Apply labels
    match, mismatch = pd.concat(match), pd.concat(mismatch)
    match.index    = [f"Reanalysis is {category_label} or drier"
                      for (lower_lim, upper_lim, category_label, category_colour) in categories]
    mismatch.index = [f"Reanalysis is less dry than {category_label}"
                      for (lower_lim, upper_lim, category_label, category_colour) in categories]

    return match, mismatch

# Helper function
def display_fraction_match_mismatch_with_formatting(df: pd.DataFrame, caption: str) -> None:
    display_html(df.style.set_caption(caption).format(precision=1))


# Main function: Display the fraction of matching / mismatching match-ups between reanalysis and ensemble members in tables
def display_fraction_match_mismatch(data_reanalysis: xr.Dataset, data_ensemble: xr.Dataset, *,
                                    indicators: Iterable[str]=["SPI", "SPEI"],
                                    drought_categories_spi:  Iterable[Iterable]=CATEGORIES_SPI[5:],
                                    drought_categories_spei: Iterable[Iterable]=CATEGORIES_SPEI[5:],
                                    ensemble_dimension: str=ENSEMBLE_DIMENSION) -> None:
    """
    Given a reanalysis and ensemble dataset, count how many ensemble members fall _at least_ into each drought category
    (e.g. SPI is below the upper limit for "severe drought")
    for pixels where the reanalysis does / doesn't meet the same threshold.
    """
    # Separate SPI and SPEI because they have different categories and thresholds
    indicators_spi, indicators_spei = [sorted([var for var in data_reanalysis.data_vars if indicator_base in var])
                                       for indicator_base in indicators]  # e.g. [SPI1, SPI12], [SPEI1, SPEI12]

    nspi, nspei = len(indicators_spi), len(indicators_spei)
    indicators_combined = indicators_spi + indicators_spei
    categories_combined = [drought_categories_spi,] * nspi + [drought_categories_spei,] * nspei  # Repeat categories so they can be looped over

    # Get labels for all unique categories in SPI and/or SPEI
    category_labels_combined = [category[2] for category in drought_categories_spei]  # Get SPEI labels
    category_labels_combined = category_labels_combined + [category[2] for category in drought_categories_spi  # Get SPI labels
                                                          if category[2] not in category_labels_combined]       # if they weren't in SPEI too

    # Loop over rows / indicators: SPI then SPEI
    match, mismatch = zip(*[fraction_match_mismatch_with_formatting(data_reanalysis, data_ensemble, indicator, categories,
                                                                    ensemble_dimension=ensemble_dimension)
                            for indicator, categories in zip(indicators_combined, categories_combined)])

    # Re-organise into dict with SPI / SPEI labels, making sure to use the same order
    match, mismatch = [pd.concat({INDEXES_NAMED[indicator]: df for indicator, df in zip(indicators_combined, data)})
                       for data in [match, mismatch]]

    # Reorder
    match    =    match.unstack(level=0).loc[[f"Reanalysis is {category_label} or drier" for category_label in category_labels_combined]]
    mismatch = mismatch.unstack(level=0).loc[[f"Reanalysis is less dry than {category_label}" for category_label in category_labels_combined]]

    # Display subsets of data and draw conclusions
    caption = ("Percentage of match-ups (pixel + timestamp) where "
               "none of the ensemble members (n = 0) agree with the reanalysis in declaring a state of drought.<br>"
               "High values indicate disagreement between the two datasets, and thus high uncertainty.")
    display_fraction_match_mismatch_with_formatting(match[["n = 0"]], caption)
    display_html("<br><br>", raw=True)  # Spacing

    caption = ("Percentage of match-ups (pixel + timestamp) where "
               "at least half the ensemble members (n ≥ 5) agree with the reanalysis in declaring a state of drought.<br>"
               "High values indicate some confidence in the drought classification.")
    display_fraction_match_mismatch_with_formatting(match[["n ≥ 5"]], caption)
    display_html("<br><br>", raw=True)  # Spacing

    caption = ("Percentage of match-ups (pixel + timestamp) where "
               "(nearly) all the ensemble members (n ≥ 9) agree with the reanalysis in declaring a state of drought.<br>"
               "High values indicate high confidence / low uncertainty in the drought classification.")
    display_fraction_match_mismatch_with_formatting(match[["n ≥ 9"]], caption)

    display_html("<br><hr><hr><br>", raw=True)  # Spacing between sections
    
    caption = ("Percentage of match-ups (pixel + timestamp) where "
               "the reanalysis does not declare a state of drought, and none of the ensemble members (n = 0) declare one either.<br>"
               "High values indicate high confidence / low uncertainty in the not-drought classification.")
    display_fraction_match_mismatch_with_formatting(mismatch[["n = 0"]], caption)
    display_html("<br><br>", raw=True)  # Spacing

    caption = ("Percentage of match-ups (pixel + timestamp) where "
               "the reanalysis does not declare a state of drought, but at least one of the ensemble members (n ≥ 1) does.<br>"
               "High values indicate low confidence in the not-drought classification, and/or potential for early warnings.")
    display_fraction_match_mismatch_with_formatting(mismatch[["n ≥ 1"]], caption)

Visualisation#

The following cell defines earthkit-plots styles for the indicators. These styles define the colour maps and colour bar ranges for each quantity.

Show code cell source

Hide code cell source

# Helper function: Translate tabulated categories to Style
def categories_to_earthkit_style(var_categories: Iterable[Iterable]) -> Style:  # e.g. CATEGORIES_SPI
    """ From a list of categories, create an earthkit.plots Style. """
    cmap = ListedColormap([category[3] for category in var_categories[::-1]])  # Colours in reverse (ascending) order
    norm =   BoundaryNorm([category[0] for category in var_categories[-2::-1]], cmap.N, extend="both")
    style = {"cmap": cmap, "norm": norm, "extend": "both"}
    return style

# Helper functions: Generate linear colour maps up to a specified colour
def categorical_colourmap(end: str, *, start: str="#000000", N: int=11) -> LinearSegmentedColormap:
    """ Generate a linear colour map from `start` (default: black) to `end` (e.g. red) in `N` steps (default: 11). """
    return LinearSegmentedColormap.from_list("", [start, end], N=N)

def is_in_category_style(end: str, *args, **kwargs) -> Style:
    """ Generate an earthkit-plots Style for 'how many members in this category'. """
    common = {"vmin": 0, "vmax": 1, "extend": "neither", "normalize": False}
    colour = {"cmap": categorical_colourmap(end, *args, **kwargs)}
    return Style(**common, **colour)

# Styles for SPI, SPEI based on Keune+25 colours
_style_categories_spi  = categories_to_earthkit_style(CATEGORIES_SPI)
_style_categories_spei = categories_to_earthkit_style(CATEGORIES_SPEI)
_style_spi = _style_spei = {"cmap": plt.cm.RdBu, "vmin": -3, "vmax": 3, "extend": "both"}
_style_spi_spread = _style_spei_spread = {"cmap": plt.cm.cividis, "vmin": 0, "vmax": 1, "extend": "max"}

styles = {
    "SPI": Style(**_style_spi), "SPI_spread": Style(**_style_spi_spread),
    "SPEI": Style(**_style_spei), "SPEI_spread": Style(**_style_spei_spread),
    "SPI_categories": Style(**_style_categories_spi),
    "SPEI_categories": Style(**_style_categories_spei),
}

# Apply general settings
for style in styles.values():
    style.normalize = False

The following cell contains some base helper functions (e.g. displaying in Jupyter Notebook or Jupyter Book style, adding textboxes with consistent formatting, etc.):

The following functions are also base helper functions, but specific to geospatial plots. Country boundaries are downloaded from the Geographic Information System of the Commission (GISCO) through earthkit-plots, and are © EuroGeographics.

The following functions display geospatial comparisons of the deterministic reanalysis and ensemble member mean/spread:

Show code cell source

Hide code cell source

def plot_reanalysis_vs_ensemble(data_reanalysis: xr.Dataset, data_ensemble: xr.Dataset, indicator: str, *,
                                dates: Optional[Iterable[pd.Timestamp | str]]=None,
                                domain: Optional[AnyDomain]=None,
                                time_dimension: str="time",
                                glue_label: Optional[str]=None) -> ekp.Figure:                   
    """
    Plot data from a reanalysis and ensemble (calculating the mean and spread) side-by-side geospatially.
    Loops over the specified dates; if none are specified, then loops over all dates.
    """
    # Find dates if none were provided
    if not dates:
        try:
            dates = data_reanalysis[time_dimension]
        except KeyError:
            raise ValueError(f"No dates provided and cannot find them on {time_dimension} dimension in data_reanalysis.")
    dates = pd.to_datetime(dates)  # Convert to Pandas timestamps for formatting

    # Determine if SPI or SPEI
    spi_or_spei = "SPEI" if "E" in indicator else "SPI"
    indicator_label = INDEXES_NAMED[indicator]

    # Calculate ensemble mean and spread
    mean_ensemble, spread_ensemble = ensemble_mean_and_spread(data_ensemble)
    PANEL_LABELS = ["reanalysis", "ensemble mean", "ensemble spread",]

    # Create figure
    nrows = len(dates)
    ncols = 3
    fig = ekp.Figure(rows=nrows, columns=ncols, size=(4*ncols, 4*nrows))

    # Loop over dates and plot data
    for date in dates:
        # Select data
        reanalysis_today, mean_today, spread_today = [data.sel({time_dimension: date})
                                                      for data in [data_reanalysis, mean_ensemble, spread_ensemble]]
        # Plot data
        subplots = [fig.add_map(domain=domain) for j in range(ncols)]
        subplots[0].grid_cells(reanalysis_today, z=indicator, style=styles[f"{spi_or_spei}"])
        subplots[1].grid_cells(mean_today,       z=indicator, style=styles[f"{spi_or_spei}"])
        subplots[2].grid_cells(spread_today,     z=indicator, style=styles[f"{spi_or_spei}_spread"])

        # Label dates
        _add_textbox_to_subplots(f"{date:%B %Y}", *subplots)

    # Colour bar on last row
    for subplot, label in zip(subplots, PANEL_LABELS):
        subplot.legend(location="bottom", label=f"{indicator_label}\n({label})")

    # Labels on top row
    for subplot, label in zip(fig.subplots[:ncols], PANEL_LABELS):
        subplot.title(label)

    # Decorate figures
    domain_name = f"over {domain.name}" if domain is not None else ""
    decorate_fig(fig, title=f"ERA5–Drought {indicator_label} {domain_name}")

    # Show result
    _glue_or_show(fig.fig, glue_label)

The following functions display geospatial comparisons of the deterministic reanalysis and ensemble members in terms of SPI / SPEI categories:

Show code cell source

Hide code cell source

def plot_drought_category_comparison(data_reanalysis: xr.Dataset, data_ensemble: xr.Dataset,
                                     indicator: str, categories: Iterable[Iterable], *,
                                     dates: Optional[Iterable[pd.Timestamp | str]]=None,
                                     drought_starting_index: int=0,
                                     domain: Optional[AnyDomain]=None,
                                     time_dimension: str="time", ensemble_dimension: str=ENSEMBLE_DIMENSION,
                                     glue_label: Optional[str]=None) -> ekp.Figure:                   
    """
    Plot drought categories from a reanalysis and ensemble side-by-side geospatially.
    For the reanalysis, simply show the indicator values with a categorical colour map.
    For the ensemble, show the # of ensemble members in each category, with one panel per category.
        `drought_starting_index` can be used to specify at which category to start the ensemble plots
        (e.g. you only want the drought ones).
    Loops over the specified dates; if none are specified, then loops over all dates.
    """
    # Find dates if none were provided
    if not dates:
        try:
            dates = data_reanalysis[time_dimension]
        except KeyError:
            raise ValueError(f"No dates provided and cannot find them on {time_dimension} dimension in data_reanalysis.")
    dates = pd.to_datetime(dates)  # Convert to Pandas timestamps for formatting

    # Determine if SPI or SPEI
    spi_or_spei = "SPEI" if "E" in indicator else "SPI"
    indicator_label = INDEXES_NAMED[indicator]
    categories_drought = categories[drought_starting_index:]  # e.g. 5: for CATEGORIES_SPI

    # Create figure
    nrows = len(dates)
    ncols = 1 + len(categories_drought)
    fig = ekp.Figure(rows=nrows, columns=ncols, size=(4*ncols, 4*nrows))

    # Loop over dates and plot data
    for date in dates:
        # Select data
        reanalysis_today, ensemble_today = [data.sel({time_dimension: date})
                                            for data in [data_reanalysis, data_ensemble]]

        # Plot reanalysis
        subplots = [fig.add_map(domain=domain) for j in range(ncols)]
        subplots[0].grid_cells(reanalysis_today, z=indicator, style=styles[f"{spi_or_spei}_categories"])

        # Loop over ensemble categories and plot data
        for (lower_lim, upper_lim, category_label, category_colour), subplot in zip(categories_drought, subplots[1:]):
            # Find ensemble members > lower lim
            in_this_category = (ensemble_today[[indicator]] < upper_lim)  # Boolean mask
            in_this_category = in_this_category.sum(dim=ensemble_dimension)  # Count
            in_this_category = in_this_category / ensemble_today.sizes[ensemble_dimension]  # Relative count
            in_this_category = in_this_category.where(ensemble_today.sel({ensemble_dimension: 1}).notnull())  # Re-apply NaN mask

            # Plot data
            subplot.grid_cells(in_this_category, z=indicator, style=is_in_category_style(category_colour))

        # Label dates
        _add_textbox_to_subplots(f"{date:%B %Y}", *subplots)

    # Colour bar on last row
    colourbar_labels = [indicator_label,] + ["fraction of ensemble members" for category in categories_drought]
    for subplot, label in zip(subplots, colourbar_labels):
        subplot.legend(location="bottom", label=label)

    # Labels on top row
    panel_labels = ["reanalysis",] + [f"{category[2]} or drier" for category in categories_drought]
    for subplot, label in zip(fig.subplots[:ncols], panel_labels):
        subplot.title(label)

    # Decorate figures
    domain_name = f"over {domain.name}" if domain is not None else ""
    decorate_fig(fig, title=f"ERA5–Drought {indicator_label} {domain_name}")

    # Show result
    _glue_or_show(fig.fig, glue_label)

2. Download data#

General setup#

This notebook uses earthkit-data to download files from the CDS. If you intend to run this notebook multiple times, it is highly recommended that you enable caching to prevent having to download the same files multiple times. If you prefer not to use earthkit, the following requests can also be used with the cdsapi module. In either case (earthkit-data or cdsapi), it is required to set up a CDS account and API key as explained on the CDS website.

In this example, we will be looking at a single drought event in one country, namely Australia. This country is defined in the following cell and can be edited when running this notebook yourself:

Identify a drought event#

We use EM-DAT [Delforge+25], the international disaster database maintained by UC Louvain’s Centre for Research on the Epidemiology of Disasters (CRED), to identify historical drought events in our chosen country (here Australia) and pick one to analyse further. EM-DAT data can be downloaded from https://public.emdat.be/ This requires registration with an account and appropriate licence. For this notebook, we assume you have downloaded data for your chosen country already. The filename can be specified below:

	Start Year	Start Month	Location	End Year	End Month
DisNo.
1967-9012-AUS	1967	NaN	South-East	1969	NaN
1974-9100-AUS	1974	12.0	Central New South Wales	1975	1.0
1976-9126-AUS	1976	NaN	Victoria, South Australia, Southern Inland New...	1976	NaN
1976-9127-AUS	1976	5.0	Southwest of Western Australia	1976	8.0
1978-9200-AUS	1978	NaN	Southern Queensland, New South Wales, Victoria...	1978	NaN
1981-9188-AUS	1981	NaN	Queensland, New South Wales, Victoria States	1982	NaN
1991-9697-AUS	1991	NaN	Western, South	1991	NaN
1992-9543-AUS	1992	12.0	Queensland	1995	12.0
2006-9554-AUS	2006	10.0	NO DATA: all country has been selected	2006	NaN
2002-9611-AUS	2002	4.0	Queensland, New South Wales provinces	2002	11.0
2018-9258-AUS	2018	1.0	New South Wales, Victoria , Queensland	2018	8.0

Based on the table above, we choose the 2018 drought event, which has well-defined start and end dates and spans a large part of the country. In the following cell, we manually define the dates for the analysis, which can be modified when running this notebook yourself:

Setup for CDS downloads#

We will be downloading different types of data (drought indicators and quality flags, deterministic reanalysis and ensemble) in this notebook. CDS data requests take the form of dictionaries in Python. When making multiple requests, it is convenient to define some constants and set up template requests with default parameters. In this section, we define our template containing those parameters that are constant between datasets: the domain in time and space. This way, these are guaranteed to be consistent between downloads and only need to be changed in one place if you wish to modify the notebook for your own use case.

We want to download SPI, SPEI, and the associated quality flags for the 1- and 12-month accumulation windows. The country and timespan were defined in the previous section; the variables and accumulation periods are defined in the following cell, and can be edited when running this notebook yourself:

The CDS request template is then defined using the values specified above. Additional information to download specific data variables will be mixed into this template in the following sections.

3. Drought indicators#

We examine the distribution of SPI and SPEI to get an intuition for circumstances in this particular drought event and the spread of the ensemble. First, this is done without applying quality flags:

SPI-1

../_images/67c1e87f38ce30ed6ee7fd725100775c4fc3e8f4f7a7d431866b85e0c7267f0f.png — Fig. 6.1.1 SPI-1 over Australia at the start (top), middle, and end (bottom) of the selected drought event, showing the deterministic reanalysis (left) and the per-pixel mean (middle) and spread (right) of the ensemble members. Quality flags have not been applied.#

SPI-12

../_images/6367752a1200757597562ec75c358ca2a121a1789d164c128fc44f89c843dbe2.png — Fig. 6.1.2 SPI-12 over Australia at the start (top), middle, and end (bottom) of the selected drought event, showing the deterministic reanalysis (left) and the per-pixel mean (middle) and spread (right) of the ensemble members. Quality flags have not been applied.#

SPEI-1

../_images/48a8e8337573b88d96e50f197c7881111270c89b580e0a14245ff47a1c7ffeb0.png — Fig. 6.1.3 SPEI-1 over Australia at the start (top), middle, and end (bottom) of the selected drought event, showing the deterministic reanalysis (left) and the per-pixel mean (middle) and spread (right) of the ensemble members. Quality flags have not been applied.#

SPEI-12

../_images/127899a03736ce71bc820d2c493f19a2df788f93e31d0c58a6493e78b3ee7bea.png — Fig. 6.1.4 SPEI-12 over Australia at the start (top), middle, and end (bottom) of the selected drought event, showing the deterministic reanalysis (left) and the per-pixel mean (middle) and spread (right) of the ensemble members. Quality flags have not been applied.#

The geospatial distributions for SPI (Figures 6.1.1–6.1.2) and SPEI (Figures 6.1.3–6.1.4) both show good agreement between the deterministic reanalysis and the ensemble mean. Both datasets generally show the same patterns with the same magnitudes, although SPI-1 and SPI-12 show differences in some areas, such as the central northern coastal area (Arnhem Land and surroundings). The most striking difference is that the ensemble mean appears much smoother, since ERA5’s ensemble members are run on a coarser grid than the deterministic reanalysis and subsequently interpolated down to the same grid in ERA5–Drought. The averaging over 10 members may cause additional smoothing of random noise.

The ensemble spread in SPI-1 and SPI-12 is very large (≥1) in some areas, particularly in January (local summer). Some of these areas correspond to noisy-looking areas in the reanalysis and ensemble mean. This pattern does not appear at all in SPEI-1 and SPEI-12, which have a spatially smooth and much smaller ensemble spread. The difference is likely due to the two indicators using different underpinning distributions (gamma vs. generalised logistic) and data (precipitation vs. precipitation – potential evaporation). This is explored further in a forthcoming notebook on the reproducibility of the ERA5–Drought dataset.

Applying the provided quality flags removes most of the noisy-appearing and high-spread areas. This includes most of the continent for SPI-1 (Figure 6.1.5) because SPI-1 is the most sensitive to dry months – which occur often in inland Australia. More data are retained for SPI-12 and SPEI (Figures 6.1.6–6.1.8), although significant gaps remain. This is particularly true for the ensemble mean and spread, which we have masked if any ensemble member is flagged at a particular pixel and timestamp to maintain statistical robustness.

In the retained areas, the ensemble spread is generally small, with the median over the entire domain and all three dates being ≤0.24 for SPI and ≤0.15 for SPEI. However, what matters most for drought monitoring is not the uncertainty in SPI or SPEI values themselves but in the corresponding drought levels. This will be explored in the next section.

SPI-1

../_images/92cbd2c989a2543b66b52a501f83585d94d9456850451088dc10de3fced5a4d9.png — Fig. 6.1.5 SPI-1 over Australia at the start (top), middle, and end (bottom) of the selected drought event, showing the deterministic reanalysis (left) and the per-pixel mean (middle) and spread (right) of the ensemble members. Quality flags for p₀ and normality have been applied.#

SPI-12

../_images/5c4c209ec2417ffe18f636caaea175aff53d3b36d5133e67acc9f9cd072e8920.png — Fig. 6.1.6 SPI-12 over Australia at the start (top), middle, and end (bottom) of the selected drought event, showing the deterministic reanalysis (left) and the per-pixel mean (middle) and spread (right) of the ensemble members. Quality flags for p₀ and normality have been applied.#

SPEI-1

../_images/e53a5d17b02badd30bfb068ec9b7b7ef22070cd2ccb1c60ac23b0bdfb2d800da.png — Fig. 6.1.7 SPEI-1 over Australia at the start (top), middle, and end (bottom) of the selected drought event, showing the deterministic reanalysis (left) and the per-pixel mean (middle) and spread (right) of the ensemble members. Quality flags for normality have been applied.#

SPEI-12

../_images/8f365f353e1c90bbf29d27f87b2b955a38a1b977ba37adb9ef5345727c2ca58a.png — Fig. 6.1.8 SPEI-12 over Australia at the start (top), middle, and end (bottom) of the selected drought event, showing the deterministic reanalysis (left) and the per-pixel mean (middle) and spread (right) of the ensemble members. Quality flags for normality have been applied.#

4. Drought classification#

Because SPI and SPEI are defined in probabilistic terms, there is a relatively simple mapping from their values to corresponding drought categories like “mild drought” or “extreme drought” [McKee+93]. Drought monitoring generally focuses on these categories. For example, governmental mechanisms for drought relief may be triggered when SPI or SPEI dips below the threshold for one of these pre-defined categories [Steinemann+06]. The thresholds used in this quality assessment follow [Keune+25].

Here, we assess the uncertainty in declaring drought levels based on how many of the ensemble members match the reanalysis. For example, if the reanalysis suggests a severe or worse drought (SPI < –1.5), and all 10 ensemble members say the same, one can be reasonably confident in declaring a severe drought. However, if only 2 ensemble members say the same, with the other 8 suggesting only moderate (–1.0 > SPI > –1.5) or even no drought (SPI > –1.0), then the uncertainty is too large to make a definitive statement. The opposite case is also interesting – if the reanalysis says there is no drought but several ensemble members say there is, this could serve as an early warning or as the basis for a more nuanced assessment of the situation on the ground.

Show code cell output

Hide code cell output

Percentage of match-ups (pixel + timestamp) where none of the ensemble members (n = 0) agree with the reanalysis in declaring a state of drought.
High values indicate disagreement between the two datasets, and thus high uncertainty.
	n = 0
	SPI-1	SPI-12	SPEI-1	SPEI-12
Reanalysis is Mildly dry or drier	nan	nan	1.6	1.5
Reanalysis is Moderately dry or drier	7.2	3.1	2.4	3.3
Reanalysis is Severely dry or drier	11.3	3.9	3.5	3.0
Reanalysis is Extremely dry or drier	15.9	3.8	6.0	10.3

Percentage of match-ups (pixel + timestamp) where at least half the ensemble members (n ≥ 5) agree with the reanalysis in declaring a state of drought.
High values indicate some confidence in the drought classification.
	n ≥ 5
	SPI-1	SPI-12	SPEI-1	SPEI-12
Reanalysis is Mildly dry or drier	nan	nan	94.3	91.7
Reanalysis is Moderately dry or drier	76.9	84.5	92.7	90.1
Reanalysis is Severely dry or drier	73.8	85.8	84.8	89.6
Reanalysis is Extremely dry or drier	73.2	72.9	90.0	44.8

Percentage of match-ups (pixel + timestamp) where (nearly) all the ensemble members (n ≥ 9) agree with the reanalysis in declaring a state of drought.
High values indicate high confidence / low uncertainty in the drought classification.
	n ≥ 9
	SPI-1	SPI-12	SPEI-1	SPEI-12
Reanalysis is Mildly dry or drier	nan	nan	86.4	79.5
Reanalysis is Moderately dry or drier	54.7	67.3	82.6	80.9
Reanalysis is Severely dry or drier	56.5	69.5	64.9	71.0
Reanalysis is Extremely dry or drier	51.2	36.2	66.7	9.2

Percentage of match-ups (pixel + timestamp) where the reanalysis does not declare a state of drought, and none of the ensemble members (n = 0) declare one either.
High values indicate high confidence / low uncertainty in the not-drought classification.
	n = 0
	SPI-1	SPI-12	SPEI-1	SPEI-12
Reanalysis is less dry than Mildly dry	nan	nan	87.2	82.6
Reanalysis is less dry than Moderately dry	90.7	85.4	90.5	89.8
Reanalysis is less dry than Severely dry	93.6	93.0	92.3	94.4
Reanalysis is less dry than Extremely dry	96.4	94.5	99.4	99.3

Percentage of match-ups (pixel + timestamp) where the reanalysis does not declare a state of drought, but at least one of the ensemble members (n ≥ 1) does.
High values indicate low confidence in the not-drought classification, and/or potential for early warnings.
	n ≥ 1
	SPI-1	SPI-12	SPEI-1	SPEI-12
Reanalysis is less dry than Mildly dry	nan	nan	12.8	17.4
Reanalysis is less dry than Moderately dry	9.3	14.6	9.5	10.2
Reanalysis is less dry than Severely dry	6.4	7.0	7.7	5.6
Reanalysis is less dry than Extremely dry	3.6	5.5	0.6	0.7

Most of the time (73–94% of match-ups), at least half of the ensemble members (n ≥ 5) agree with the reanalysis in declaring a drought level. This is particularly true for SPEI (≥85%), matching the smaller spread seen in the previous section. “Extremely dry or drier” SPEI-12 is an outlier, but this is likely a fluke caused by the small number of pixels falling into said category. It is rare (1–16% of match-ups) for none of the ensemble members (n = 0) to match the drought level in the reanalysis, and correspondingly it is common for none of the ensemble members to declare a drought when the reanalysis does not (e.g. out of the pixels and timestamps where SPI-1 is not moderately dry or drier, only 9% have one or more ensemble members that are moderately dry or drier). Full or near-full agreement (n ≥ 9 or 10) occurs in around half to three quarters of match-ups. In conclusion, the drought indicators in ERA5–Drought can be used with confidence in drought monitoring, and the included ensemble can be used to measure said confidence.

There are two main causes for mismatches between the ensemble and reanalysis, or equivalently for uncertainty in the declared drought levels. First, the ensemble propagates the uncertainty in the underpinning ERA5 data. Second, because the ensemble is generated at a coarser resolution than the deterministic reanalysis and then downscaled, pixels in the ensemble represent conditions in a wider spatial area. Both factors particularly affect the edges of drought areas, where the uncertainty in ERA5 straddles the threshold of a drought level and the ensemble combines drier and less dry areas. Indeed, this pattern can be seen in the geospatial distributions of matches and mismatches (Figures 6.1.9–6.1.12). Other methodological factors (discussed in the introduction) likely contribute further to the uncertainty.

The ensemble also provides new opportunities for more nuanced monitoring of drought, compared to a binary classification based on the deterministic reanalysis alone. As suggested above, additional triggers could be set based on the number of ensemble members that pass an SPI or SPEI threshold. The geospatial distributions (Figures 6.1.9–6.1.12) show that such an approach would capture a wider and smoother area of potential drought, which can then be targeted for follow-up measurements or early warnings.

SPI-1

../_images/d83d806d5d8b812b12c64db42eece3c2fd93cc82fbf8f321995d4b7225fb4ca6.png — Fig. 6.1.9 SPI-1 over Australia at the start (top), middle, and end (bottom) of the selected drought event, showing the deterministic reanalysis (left) categorised following [Keune+25] and the fraction of ensemble members (out of 10) per pixel that fall into moderate-to-extreme drought categories. Quality flags for p₀ and normality have been applied.#

SPI-12

../_images/5aa3bf881b0e4db62a690caeb0abc34d06f25c7b33e5bd02e031d682403885e7.png — Fig. 6.1.10 SPI-12 over Australia at the start (top), middle, and end (bottom) of the selected drought event, showing the deterministic reanalysis (left) categorised following [Keune+25] and the fraction of ensemble members (out of 10) per pixel that fall into moderate-to-extreme drought categories. Quality flags for p₀ and normality have been applied.#

SPEI-1

../_images/9b191482a7b9d1dbf01f94b3644e05bf29468f6007b198ef9d8ddfe79c53b118.png — Fig. 6.1.11 SPEI-1 over Australia at the start (top), middle, and end (bottom) of the selected drought event, showing the deterministic reanalysis (left) categorised following [Keune+25] and the fraction of ensemble members (out of 10) per pixel that fall into moderate-to-extreme drought categories. Quality flags for normality have been applied.#

SPEI-12

../_images/b818220c3accd2604f9bcfd04cb7c88dcad01e392f4a5f418e09433d42edadd4.png — Fig. 6.1.12 SPEI-12 over Australia at the start (top), middle, and end (bottom) of the selected drought event, showing the deterministic reanalysis (left) categorised following [Keune+25] and the fraction of ensemble members (out of 10) per pixel that fall into moderate-to-extreme drought categories. Quality flags for normality have been applied.#

ℹ️ If you want to know more#

Key resources#

The CDS catalogue entries for the data used were:

Monthly drought indices from 1940 to present derived from ERA5 reanalysis: derived-drought-historical-monthly

Code libraries used:

earthkit
- earthkit-data
- earthkit-plots

More about drought indicators:

More about drought monitoring:

More about ERA5:

More about uncertainty in ensemble reanalyses:

References#

[Delforge+25] D. Delforge et al., ‘EM-DAT: the Emergency Events Database’, International Journal of Disaster Risk Reduction, vol. 124, p. 105509, Jun. 2025, doi: 10.1016/j.ijdrr.2025.105509.

[ECIU+25] Energy & Climate Intelligence Unit, ‘Estimated financial losses faced by UK farmers due dry weather impacts on key arable crops’, Energy & Climate Intelligence Unit, London, United Kingdom, Dec. 2025.

[Evenflow+24] Evenflow, ‘The value generated by ERA5’, Copernicus Climate Change Service (C3S), Bonn, Germany, Dec. 2024. [Online]. Available: https://climate.copernicus.eu/sites/default/files/2024-12/Value-generated-by-ERA5-full-report.pdf

[Franch-Pardo+25] I. Franch-Pardo, P. A. F. Puig, and A. Cerdà, ‘Geospatial Technologies in Crisis Response: Analyzing the 2024 Floods in Valencia, Spain’, European Journal of Geography, vol. 16, no. 2, pp. 286–297, Aug. 2025, doi: 10.48088/ejg.i.fra.16.2.286.297.

[Hersbach+20] H. Hersbach et al., ‘The ERA5 global reanalysis’, Quarterly Journal of the Royal Meteorological Society, vol. 146, no. 730, pp. 1999–2049, May 2020, doi: 10.1002/qj.3803.

[IPCC+23] IPCC, ‘Climate Change 2023: Synthesis Report. Contribution of Working Groups I, II and III to the Sixth Assessment Report of the Intergovernmental Panel on Climate Change’, Intergovernmental Panel on Climate Change (IPCC), Geneva, Switzerland, Jul. 2023. doi: 10.59327/IPCC/AR6-9789291691647.

[Isaksen+10] L. Isaksen et al., ‘Ensemble of data assimilations at ECMWF’, European Centre for Medium-Range Weather Forecasts, Reading, UK, 636, Dec. 2010. doi: 10.21957/obke4k60.

[JCGM+08] JCGM, ‘Evaluation of measurement data — Supplement 1 to the “Guide to the expression of uncertainty in measurement” — Propagation of distributions using a Monte Carlo method’, Joint Committee for Guides in Metrology, Sevres, Paris, France, 101:2008, 2008. doi: 10.59161/JCGM101-2008.

[Keune+25] J. Keune, F. Di Giuseppe, C. Barnard, E. Damasio da Costa, and F. Wetterhall, ‘ERA5–Drought: Global drought indices based on ECMWF reanalysis’, Scientific Data, vol. 12, p. 616, Apr. 2025, doi: 10.1038/s41597-025-04896-y.

[McKee+93] T. B. McKee, N. J. Doesken, and J. Kleist, ‘The relationship of drought frequency and duration to time scales’, in Eighth Conference on Applied Climatology, Anaheim, California, USA, Jan. 1993.

[Shapiro+65] S. S. Shapiro and M. B. Wilk, ‘An analysis of variance test for normality (complete samples)’, Biometrika, vol. 52, no. 3–4, pp. 591–611, Dec. 1965, doi: 10.1093/biomet/52.3-4.591.

[Soci+24] C. Soci et al., ‘The ERA5 global reanalysis from 1940 to 2022’, Quarterly Journal of the Royal Meteorological Society, vol. 150, no. 764, pp. 4014–4048, Jul. 2024, doi: 10.1002/qj.4803.

[Steinemann+06] A. C. Steinemann and L. F. N. Cavalcanti, ‘Developing Multiple Indicators and Triggers for Drought Plans’, Journal of Water Resources Planning and Management, vol. 132, no. 3, pp. 164–174, May 2006, doi: 10.1061/(ASCE)0733-9496(2006)132:3(164).

[UNICEF+24] UNICEF, ‘Latin America and Caribbean Region Flash Update No. 2 (Climate-related crisis in the Amazon Region)’, UNICEF, Nov. 2024.

[Vicente-Serrano+10] S. M. Vicente-Serrano, S. Beguería, and J. I. López-Moreno, ‘A Multiscalar Drought Index Sensitive to Global Warming: The Standardized Precipitation Evapotranspiration Index’, Journal of Climate, vol. 23, no. 7, pp. 1696–1718, Apr. 2010, doi: 10.1175/2009JCLI2909.1.

Uncertainty in drought indicators for national monitoring

Contents

6.1. Uncertainty in drought indicators for national monitoring#

🌍 Use case: Monitoring drought on a national scale using reanalysis-based drought indicators#

❓ Quality assessment question#

📢 Quality assessment statement#

📋 Methodology#

📈 Analysis and results#

1. Code setup#

Import required libraries#

Helper functions#

Data (pre-)processing#

Categorising SPI and SPEI#

Quality flags#

Statistics#

Visualisation#

2. Download data#

General setup#

Identify a drought event#

Setup for CDS downloads#

Download SPI and SPEI#

Download quality flags#

3. Drought indicators#

4. Drought classification#

ℹ️ If you want to know more#

Key resources#

References#