The impact of scale and temporal trends in seasonal forecasts

4.1. The impact of scale and temporal trends in seasonal forecasts#

Production date: 30.04.2025

Produced by: Johannes Langvatn (METNorway), Johanna Tjernström (METNorway)

🌍 Use case: How effectively can seasonal forecasts predict the development of key global climate variables across varying spatial scales?#

❓ Quality assessment question#

How does skill in seasonal forecasts vary for different spatial-scales and regions of interest?
How does the quality in seasonal forecasts depend on temporal trends in the data?

The effectiveness of seasonal forecasts in predicting the development of key global climate variables—such as temperature, precipitation and sea surface temperatures (SSTs)—varies considerably by variable, region, season, lead time and the state of large-scale climate drivers like ENSO (El Niño–Southern Oscillation). This notebook investigates how a potential trend in the data (e.g. as discussed in Greuell et al. (2019) [1]) along with selection of regions at various spatial scales (e.g. as discussed in Prodhomme et al. (2021) [2], Gubler et al. (2020) [3] Quaglia et al. (2021) [4]) affect forecasting quality

📢 Quality assessment statement#

These are the key outcomes of this assessment

For regions and locations which have a visible trend in t2m, the trend inflates the correlation of t2m compared to reanalysis.
- For Europe, most of the visible correlation can be attributed to the underlying trend.
- Compared with the region of greater horn of Africa, where less of the visible correlation can be attributed to the underlying trend
There is a variation in forecast quality depending on selection of spatial scale and region.
The forecast quality of t2m varies depending on the choice of the regions or locations
- The regional mean for NINO3.4 is better correlated to the reanalysis than the global regional mean
- Bonn has low correlation and most of the quality can be attributed to the underlying trend in the dataset
- Addis Ababa has some correlation which is retained even after detrending

📋 Methodology#

This notebook provides a comprehensive assessment of the quality of seasonal monthly forecast anomalies (https://cds.climate.copernicus.eu/datasets/seasonal-monthly-single-levels) through comparison with the ERA5 reanalysis (https://cds.climate.copernicus.eu/datasets/reanalysis-era5-single-levels-monthly-means). The analysis is carried out globally and for three regions chosen based on effects of trend on the correlation. Data were accessed from the CDS. Using this data the anomaly was calculated, and was determined assuming a linear trend. A dataset was then generated containing both the original anomaly and the detrended.

First, the global case was considered both including and excluding trend, plotting anomaly for the ensemble members, ensemble mean, and ERA5. The correlation was also calculated.

To further consider the correlation, maps were generated to more clearly view the spatial variations of correlation in anomaly (ref?) between the seasonal forecast and ERA5. Based on these maps, a set of regions were selected for further study, based on the magnitude of the correlation. Plots were then generated for nested domains. Three spatial scales were considered, a larger, continental scale, a smaller country scale and a much smaller city scale. Data were selected for each domain based on a lat/lon box.

The resulting plots, displaying the ensemble members, ensemble mean, and ERA5, aim to visualise the variation in quality based on the selected regions of interest. This was also done for the detrended anomaly, and correlation was also calculated for each of the domains.

1. Choose data to use and setup code

Choose a selection of forecast systems and model versions, hindcast period (normally 1993-2016), forecast and leadtime months
Compute the anomaly of the forecast and reanalysis data
The data is detrended assuming a linear trend in the data from start year to end year
Both the original and detrended data is saved for the following analysis

2. Comparing global correlation

Compute the ensemble mean of the forecasted anomaly
Plot the correlation of forecast and reanalysis data, before and after performing the detrending

3. Map of correlation for detrended data

Correlation is plotted for each grid point on a map
Based on this, regions are selected to investigate further; Addis Ababa, Bonn, NINO3.4.

4. Plot and describe results

Plots are generated for the regions chosen in the previous section

📈 Analysis and results#

1. Choose data to use and setup code#

This section contains the setup and data processing needed for performing the analysis

Import external code libraries needed

Define functions to be used throughout the notebook

Show code cell source Hide code cell source

def compute_anomaly(obj):
    climatology = obj.mean({"realization"} & set(obj.dims))
    climatology = diagnostics.time_weighted_mean(climatology, weights=False)
    return obj - climatology

def detrend(obj):
    trend = xr.polyval(obj["time"], obj.polyfit("time", deg=1).polyfit_coefficients)
    return obj - trend

def compute_monthly_anomaly(ds):
    (da,) = ds.data_vars.values()
    with xr.set_options(keep_attrs=True):
        da = da.groupby("time.month").map(compute_anomaly)
        da_detrend = da.groupby("time.month").map(detrend)
    da = xr.concat(
        [da.expand_dims(detrend=[False]), da_detrend.expand_dims(detrend=[True])],
        "detrend",
    )
    da.encoding["chunksizes"] = tuple(
        1 if dim in ("realization", "detrend") else size
        for dim, size in da.sizes.items()
    )
    return da.to_dataset()

def plot_func(dict_with_leadtime_keys, list_of_region_keys, fignums, plot_scale=3, time_range = None):
    if time_range is None:
        time_range = dict_with_leadtime_keys.keys()
    for detrend in [False, True]:
        for leadtime in time_range:
            region_dict = {key : master_dict[leadtime][key] for key in list_of_region_keys}
            
            num_plots = len(list_of_region_keys)
            num_cols = 4#int(np.ceil(num_plots/plot_scale))
            num_rows = num_plots // num_cols 
            if num_plots % num_cols != 0:
                num_rows += 1
            
            fig,axs = plt.subplots(num_rows,num_cols, figsize=(17,4))
            dtrnd = "(Detrended)" if detrend else ""
            fig.suptitle(f"Figure {fignums[0]}: Different regions - t2m, Leadtime: {leadtime+2}, Valid for: {months[leadtime]} "+ dtrnd)
            fignums.pop(0)
            count_row = 0
            count_col = 0
            for region_name, dictionary in region_dict.items(): 
                data = dictionary["data_sf"]
                lat = dictionary["lat"]
                lon = dictionary["lon"]
                da_reanalysis_mean = dictionary["data_era5"]
                mean_ens = dictionary["mean_ens"]
                if num_cols > 1 and num_rows > 1:
                    ax = axs[count_row,count_col]
                elif num_plots == 1:
                    ax = axs
                else:
                    ax = axs[count_col]
                
                corr = float(xr.corr(mean_ens.sel(detrend=detrend), da_reanalysis_mean.sel(detrend=detrend)))
                data.sel(detrend=detrend).plot.scatter(x="time",hue_style="bo", ax = ax, label ="Realization")
                mean_ens.sel(detrend=detrend).plot.scatter(x="time", color = "r", marker="o", ax = ax, label = "Realization Mean")
                da_reanalysis_mean.sel(detrend=detrend).plot.scatter(x="time", color = "r", marker="x", ax = ax, label = "ERA5")
                # ax.set_title(region_name + f" lat: [{lat.stop}, {lat.start}] lon: [{lon.start}, {lon.stop}], Corr = {corr:.3}")
                ax.set_title(region_name + f" Corr = {corr:.3}")
                with mpl.rc_context({"legend.fontsize": "small", "legend.framealpha" : 0.3}):
                    ax.legend()
                count_col += 1
                if count_col >= num_cols:
                    count_row += 1
                    count_col = 0
            plt.tight_layout()
            plt.show()

def plot_regions(region_dict,regions):
    fig = plt.figure(figsize=(10, 5))
    central = (region_dict[regions[0]]["lon"].start + region_dict[regions[0]]["lon"].stop)/2
    ax = fig.add_subplot(projection=ccrs.PlateCarree(central_longitude=central))
    
    # make the map global rather than have it zoom in to
    # the extents of any plotted data
    ax.set_extent([region_dict[regions[0]]["lon"].start,region_dict[regions[0]]["lon"].stop,
    region_dict[regions[0]]["lat"].start,region_dict[regions[0]]["lat"].stop])
    
    ax.coastlines()
    width = region_dict[regions[1]]["lon"].stop - region_dict[regions[1]]["lon"].start
    height = region_dict[regions[1]]["lat"].stop - region_dict[regions[1]]["lat"].start
    
    ax.add_patch(mpatches.Rectangle(xy=[region_dict[regions[1]]["lon"].start, region_dict[regions[1]]["lat"].start],
                                    width=width, height=height,
                                    facecolor='none', edgecolor='k',lw = 2,
                                    transform=ccrs.PlateCarree()))
    
    ax.plot(region_dict[regions[2]]["lon"].start,region_dict[regions[2]]["lat"].start,marker='o',transform=ccrs.PlateCarree())
    ax.set_title(f"Shows {regions[1]} [box] and {regions[2]} [blue circle] in the {regions[0]} region")
    plt.show()

2. Comparing global correlation#

Compute the ensemble mean of the seasonal forecast anomaly data

Plot the original anomaly data for each ensemble member, the ensemble mean, and the reanalysis anomaly.

Plot the detrended anomaly data for each ensemble member, the ensemble mean, and the reanalysis anomaly.

../_images/3cf254738274e47e1cac7a6166ab9ef6b89e7f5395e25cb6006f20816d32d639.png

Plot after detrending#

../_images/08578997a2ad2c3926c66d8304a73f21aaf356e2d044a072658d313f4073ad92.png

3. Map of correlation for detrended data#

The correlation of the detrended data is plotted compared to reanalysis data for each grid cell on a map

../_images/a5d31cc5db7694fbffd04efe679306f5187fe21ee833d2d1cf7147cb9012f7f9.png

../_images/870edb3a25e0961a4ff2764f00bc411af3c2a921a0f15f44640722ef5f9f2598.png

../_images/006d630753f654e66c0f952751cbc4abe39f86e45a79e622c8d847faa8044980.png

Selecting regions#

Based on the correlation maps, regions are selected to investigate further; Addis Ababa, Bonn, NINO3.4.

../_images/bb2f225ca0221ff4e7b00eb3cd5463c45fa6f3d719257aebbf745b9369f9c685.png

4. Plot and describe results#

In this section the regional means are computed and plots generated to illustrate scale variations for the selected regions

../_images/17dc8b373457db0c6e4409c07328ff596c06b2b9c85ed3a69ffe947db7bbb391.png

../_images/47e9f60108c0357376462d2f55cefe97e6fb2f59442ed89eb81e25770a47d36e.png

../_images/f2cd64322975daca5a65d564c6e1138e247e5328cd5dc5fd20bd317f561aeae3.png

../_images/d83b6c83d0ffcc2aa9bcda580e3f22d7a45ec27f1aca6d92dd01934a8b3d042e.png

../_images/0db0b7d5a794482a5326f321872f8f1c359908cfde9247fd5fef341787cf8c73.png

../_images/f63a0a4c498dd7a4cbb531c9cfccaf8b5555fcf387bfdd9273222826b6fbae48.png

../_images/b20baf2c838407e7b316963a834f9d8e0a7c9dd9132b04c60bae862a668834b0.png

../_images/15b31af099360191eb4460f527de3fd1f455ae86a35d8a798583ddee30cb1f01.png

../_images/24f5052fceba7dd632b8274a591c87831275290df1153232ec82dc3149507afc.png

How to read the results#

Looking at the comparison of different regions at different scales a few conclusions can be drawn. Firstly, for regions with a visible trend, that trend inflates the correlation for t2m compared against analysis. This can be seen for Europe, where comparing the detrended and trended plots, the correlation is greater for the data with trend in Europe. Hence, for Europe, most of the visible correlation can be attributed to the underlying trend.

This can be compared with the region of greater horn of Africa. Where comparing the trended and detrended case indicates that less of the visible correlation can be attributed to the underlying trend.

For the selected regions, forecast quality depending on scale varies. Looking at the Europe, Germany, Bonn case, there is a drop in forecast quality as the scale decreases. This cannot be observed in the NINO3.4 where the quality remains more or less constant, despite the decrease in region. The Greater Horn of Africa, where the drop off in quality with scale can be seen somewhat in the case of Addis Ababa, but not on the same scale as can be seen in the Europe case.

Considering this, and the anomalies for the different regions at different scales, it can be seen that the region of NINO3.4 has a better correlation with the reanalysis than the global regional mean, while Bonn has a low correlation where most of the quality can be attributed to the underlying trend. Addis Ababa has some correlation, which is retained even after detrending.

ℹ️ If you want to know more#

Key resources#

Seasonal forecast monthly statistics on single levels: 10.24381/cds.68dd14c3
ERA5 monthly averaged data on single levels from 1940 to present: 10.24381/cds.f17050d7

Code libraries used:#

C3S EQC custom functions, c3s_eqc_automatic_quality_control, prepared by B-Open
xarray
numpy
matplotlib
cartopy
linear_model from sklearn

References#

[1] Greuell, W., Franssen, W. H. P., and Hutjes, R. W. A.: Seasonal streamflow forecasts for Europe – Part 2: Sources of skill, Hydrol. Earth Syst. Sci., 23, 371–391, https://doi.org/10.5194/hess-23-371-2019, 2019.

[2] Prodhomme, C., Materia, S., Ardilouze, C. et al. Seasonal prediction of European summer heatwaves. Clim Dyn 58, 2149–2166 (2022). https://doi.org/10.1007/s00382-021-05828-3

[3] Gubler, S., and Coauthors, 2020: Assessment of ECMWF SEAS5 Seasonal Forecast Performance over South America. Wea. Forecasting, 35, 561–584, https://doi.org/10.1175/WAF-D-19-0106.1

[4] Calì Quaglia, F., Terzago, S. & von Hardenberg, J. Temperature and precipitation seasonal forecasts over the Mediterranean region: added value compared to simple forecasting methods. Clim Dyn 58, 2167–2191 (2022). https://doi.org/10.1007/s00382-021-05895-6