logo

5.1.7. Historical accuracy of sea ice extent in the CMIP6 projections#

Production date: 2024-05-31

Produced by: Timothy Williams, Nansen Environmental and Remote Sensing Center (NERSC)

🌍 Use case: Scientific assessment of sea ice extent in CMIP6 models to inform product creation#

❓ Quality assessment question#

  • How reliable are the historical estimates of the Arctic and Antarctic sea ice concentration and extent from the CMIP6 models?

📢 Quality assessment statement#

These are the key outcomes of this assessment

  • Bearing in mind that sea ice concentration estimates from passive microwave observations themselves are quite uncertain, we find that the errors between them and the CMIP6 models in the historical experiment can have significant biases and also have quite a large spread, so should be treated with some caution.

  • We find that the CMIP6 models generally underestimate the sea ice extent and area. Errors for the Antarctic are approximately double those in the Arctic. While the Arctic sea ice minima are generally quite accurate, the Arctic maxima are consistently underestimated, as are the Antarctic extrema.

  • At the time of the Arctic minimum, there is a general underestimation in the pack ice, but an overestimation in the marginal ice zone (MIZ) and near the coasts.

  • At the time of the Arctic maximum, the concentration in the pack is relatively unbiased, but there are some areas where there is strong underestimation (Bering Sea, Sea of Okhotsk)and overestimation (Greenland Sea, Labrador Sea).

  • Arctic December: one of the other CMIP6 quality assessments looks at projections of accessibility of Arctic shipping routes, where sea ice in December is particularly interesting. Therefore we also check the reliability of the December concentrations in the CMIP6 models. We find that the concentration in the pack is quite similar to the observations, although there is significant underestimation in Hudson Bay, and less pronounced underestimation in the Bering Sea. In later years, there is also slight underestimation off the Russian coast. The ice extent in the Greenland and Barents Sea is consistently overestimated.

  • At the time of the Antarctic minimum, there is strong underestimation in the Weddell, Bellingshausen and Amundsen Seas, while south of the Pacific and Indian Oceans, there is less bias. However, there there is too little ice at the coast and too much away from it.

  • At the time of the Antarctic maximum, there is in general too little ice everywhere, with the region away from the coast at longitude about 140W south of the Atlantic Ocean, and in the region south of the Indian Ocean having the most pronounced underestimation. The underestimation on those areas is also increasing with time, while the concentration from satellite is staying relatively constant in this month as well.

  • These results are generally consistent with analyses from other authors - for the Arctic by the SIMIP community (2020), Davy and Outten (2020), Shu et al (2020), Watts et al (2021), Henke et al (2023) and Frankignoul et al (2024); for the Antarctic by Roach et al (2020), Shu et al (2020), Nie et al (2023) and Li et al (2023).

  • We ourselves don’t consider time series of the sea ice minima themselves, but only plot climatologies of the multi-model mean to see how the differences between the models and the observations are distributed spatially. However, this has been done by a few others, e.g. Shu et al (2020), who found the observed Arctic September sea ice extent (SIE) declining trend between 1979 and 2014 is slightly underestimated in CMIP6 models, while the observed weak but significant upward trend of the Antarctic SIE is not captured.

📋 Methodology#

We compare sea ice concentrations from the CMIP6 historical experiment with that obtained from passive microwave satellite products from EUMETSAT OSI- SAF and ESA CCI. Time series of the evaluation metrics Integrated Ice Edge Error (IIEE) (Goessling et al, 2016; Henke et al, 2023), bias in extent and area, and the root mean square error (RMSE) in sea ice concentration are produced and plotted. All of these statistics are area-weighted averages. We consider all the CMIP6 models that output sea ice concentration in the historical experiment, but remove outliers by only plotting the interquartile limits (IQLs). Other authors have chosen different subsetting approaches - e.g. by scoring the models and only retaining the top-performing ones. We recommend users test as many models as possible, before deciding if a subset adequately represents the uncertainty for their application.

Decadal climatologies for both the models and observations are also produced and compared, for the months March, September and December.

The “Analysis and results” section is organised as follows:

1. Import libraries, set parameters and definitions of functions

2. Download and transform data

3. Results

        3.1 Time series of evaluation metrics

        Plot time series of the error metrics.

        3.2 Spatial distribution of errors

        This section has maps of climatologies and their biases for the Arctic minimum, Arctic maximum, Antarctic minimum, Antarctic maximum. We also plot maps for the Arctic December since this was an interesting month for the climate projections.

📈 Analysis and results#

1. Import libraries, set parameters and definitions of functions#

1.1 Import libraries#

Import the required libraries, including the EQC toolbox.

Hide code cell source
import datetime
import warnings
import calendar
import pandas as pd
import numpy as np
import cartopy.crs as ccrs
import matplotlib.pyplot as plt
import xarray as xr
from cmocean import cm

from c3s_eqc_automatic_quality_control import diagnostics, download, plot

plt.style.use("seaborn-v0_8-notebook")
warnings.filterwarnings("ignore", module="cf_xarray")
warnings.filterwarnings("ignore", module="numpy")

1.2 Set parameters#

  • Set the time period to be analysed with year_start and year_stop.

  • Set the concentration threshold sic_threshold for determining sea ice extent (we use 30% to be consistent with the ice edge product).

  • clim_months is a list of the months to plot the climatologies for.

  • decades_historical of starting years of decades to calculate climatologies for.

  • projections is a dictionary of projections to plot the climatology maps in.

  • map_slices is a dictionary of slices in the x and y directions (on observation grid) to zoom in on the climatology maps.

  • Set the regions to be analysed with the list regions.

  • Set the models to be evaluated with the list models.

Hide code cell source
# Time
year_start = 1970
year_stop = 2019
assert year_start >= 1970 and year_stop <= 2019
assert not (year_start % 10 or (year_stop + 1) % 10)

# Sea Ice Concentration Threshold
sic_threshold = 30  # %

# months to plot the climatologies for
clim_months = [3, 9, 12]

# starting years of decades to calculate climatologies for
decades_historical = range(1985, 2015, 10)

# projections to plot the climatology maps in
projections = {
    "Arctic" : ccrs.Stereographic(central_latitude=90.),
    "Antarctic" : ccrs.Stereographic(central_latitude=-90.),
}

# slices in x and y directions (on observation grid) to zoom in on the climatology maps
map_slices = {
    "Arctic": {'xc': slice(50, -100), 'yc': slice(50, -50)},
    "Antarctic": {'xc': slice(50, -50), 'yc': slice(50, -50)},
}

# Select regions
regions = ["northern_hemisphere", "southern_hemisphere"]
assert set(regions) <= {
    "northern_hemisphere",
    "southern_hemisphere",
}

# Choose CMIP6 historical models
models = [
    "access_cm2",
    "access_esm1_5",
    "cams_csm1_0",
    "canesm5",
    "canesm5_canoe",
    "cmcc_cm2_hr4",
    "cmcc_cm2_sr5",
    "cmcc_esm2",
    "cnrm_cm6_1",
    "cnrm_cm6_1_hr",
    "cnrm_esm2_1",
    "e3sm_1_0",
    "e3sm_1_1",
    "e3sm_1_1_eca",
    "ec_earth3_aerchem",
    "ec_earth3_cc",
    "ec_earth3_veg_lr",
    "fgoals_f3_l",
    "giss_e2_1_h",
    "hadgem3_gc31_ll",
    "hadgem3_gc31_mm",
    "inm_cm4_8",
    "inm_cm5_0",
    "ipsl_cm5a2_inca",
    "ipsl_cm6a_lr",
    "miroc6",
    "miroc_es2l",
    "mpi_esm1_2_hr",
    "mpi_esm1_2_lr",
    "mri_esm2_0",
    "nesm3",
    "norcpm1",
    "taiesm1",
    "ukesm1_0_ll",
]

1.3 Define requests for CDS data#

Define the download requests in the format needed by the EQC toolbox.

Hide code cell source
all_months = [f"{month:02d}" for month in range(1, 13)]

request_cmip6_historical = (
    "projections-cmip6",
    {
        "format": "zip",
        "temporal_resolution": "monthly",
        "experiment": "historical",
        "variable": "sea_ice_area_percentage_on_ocean_grid",
        "year": [
            str(year) for year in range(max(year_start, 1850), min(year_stop, 2014) + 1)
        ],
        "month": all_months,
    },
)

request_eumetsat = (
    "satellite-sea-ice-concentration",
    download.update_request_date(
        {
            "cdr_type": "cdr",
            "origin": "EUMETSAT OSI SAF",
            "sensor": "ssmis",
            "temporal_aggregation": "daily",
            "variable": "all",
            "version": "v2",
        },
        start=f"{max(year_start, 1979)}-01",
        stop=f"{min(year_stop, 2015)}-12",
        stringify_dates=True,
    ),
)

1.4 Functions to create the time series#

These functions are all applied to a single CMIP6 model.

  • interpolate_to_satellite_grid interpolates the model to the satellite grid.

  • get_monthly_interpolated_data is applied to both the model and satellite data, to take the monthly mean (the satellite data is daily, but this also forces all the models to have the same time coordinate) and interpolate to the satellite grid (only for model data). It also calculates the RMS error (only for satellite data).

  • get_satellite_data downloads the daily satellite data and applies get_monthly_interpolated_data to get the monthly mean and the RMS error in the data.

  • compare_model_vs_obs compares the sea ice concentration from a single CMIP6 model with satellite estimates, calculating error metrics like biases in concentration and extent, RMSE, and IIEE.

  • compute_sea_ice_evaluation_diagnostics downloads the satellite and model data and calls compare_model_vs_obs to get a time series of error metrics.

Hide code cell source
def interpolate_to_satellite_grid(obj, region, **regrid_kwargs):
    # Remove nan columns
    for dim in [dim for dim in obj.dims if "x" in dim or "lon" in dim]:
        for i in (0, -1):
            if obj.isel({dim: i}).isnull().all():
                obj = obj.drop_isel({dim: i})

    collection_id = "satellite-sea-ice-concentration"
    request = {
        "region": region,
        "version": "v2",
        "variable": "all",
        "format": "zip",
        "origin": "ESA CCI",
        "sensor": "amsr",
        "temporal_aggregation": "daily",
        "cdr_type": "cdr",
        "year": "2002",
        "month": "06",
        "day": "01",
    }
    grid_out = download.download_and_transform(collection_id, request).drop_dims("time")
    return diagnostics.regrid(obj, grid_out, **regrid_kwargs)
    

def get_monthly_interpolated_data(ds, add_stde, check_values, region, **regrid_kwargs):
    if add_stde:
        stde = ds.cf["sea_ice_area_fraction standard_error"]

    ds = ds.cf[["latitude", "longitude", "sea_ice_area_fraction"]]
    ds = ds.drop_dims(set(ds.dims) & {"vertices", "bnds"})

    if regrid_kwargs:
        ds = interpolate_to_satellite_grid(ds, region, **regrid_kwargs)

    ds = ds.sortby("time").resample(time="MS").mean()
    ds["time"].attrs["long_name"] = "time"

    if add_stde:
        with xr.set_options(keep_attrs=True):
            ds = ds.merge((stde**2).resample(time="MS").mean() ** (1 / 2))

    if check_values:
        mask = ds.cf["sea_ice_area_fraction"].notnull() & (
            ds.cf["sea_ice_area_fraction"] != 0
        )
        ds = ds.sel(time=mask.any(set(mask.dims) - {"time"}))
    return ds


def get_satellite_data(time, region):
    year_start = time.dt.year.min().values
    year_stop = time.dt.year.max().values

    common_request = {
        "cdr_type": "cdr",
        "variable": "all",
        "version": "v2",
        "region": region,
    }
    satellite_requests = {
        "ESA-CCI": download.update_request_date(
            common_request
            | {
                "origin": "ESA CCI",
                "sensor": "amsr",
                "temporal_aggregation": "daily",
            },
            start=f"{max(year_start, 2002)}-01",
            stop=f"{min(year_stop, 2017)}-12",
            stringify_dates=True,
        ),
        "EUMETSAT-OSI-SAF": download.update_request_date(
            common_request
            | {
                "origin": "EUMETSAT OSI SAF",
                "sensor": "ssmis",
                "temporal_aggregation": "daily",
            },
            start=f"{max(year_start, 1979)}-01",
            stop=f"{min(year_stop, 2015)}-12",
            stringify_dates=True,
        ),
    }

    datasets_satellite = {}
    for name, requests in satellite_requests.items():
        if not requests:
            continue
        print(f"{name=}")
        datasets_satellite[name] = download.download_and_transform(
            "satellite-sea-ice-concentration",
            requests,
            chunks={"year": 1},
            transform_func=get_monthly_interpolated_data,
            transform_func_kwargs={
                "add_stde": True,
                "check_values": True,
                "region": region,
            },
        )
    return datasets_satellite


def compare_model_vs_obs(ds, datasets_sat, sic_threshold, grid_cell_area):
    ds = ds.convert_calendar("standard", align_on="date")
    datasets_sat = {
        k: ds.convert_calendar("standard", align_on="date")
        for k, ds in datasets_sat.items()
    }

    grid_cell_area *= 1.0e-6  # 10^6 km2
    sic = ds.cf["sea_ice_area_fraction"]
    if sic.attrs.get("units", "") == "(0 - 1)":
        sic *= 100

    dims = ("xc", "yc")
    datasets = []
    for origin, ds_sat in datasets_sat.items():
        # Get variables
        sic_obs = ds_sat.cf["sea_ice_area_fraction"]
        sic_obs_err = ds_sat.cf["sea_ice_area_fraction standard_error"]
        sic_model = sic.sel(time=sic_obs["time"])

        # Compute useful variables
        sic_diff = sic_model - sic_obs
        over = ((sic_model > sic_threshold) & (sic_obs <= sic_threshold)).sum(dims)
        under = ((sic_model <= sic_threshold) & (sic_obs > sic_threshold)).sum(dims)

        # Compute output
        dataarrays = {}
        dataarrays["siconc_bias"] = sic_diff.mean(dims)
        dataarrays["siconc_bias"].attrs = {
            "standard_name": "sea_ice_concentration_bias",
            "units": "%",
            "long_name": "Sea ice concentration bias",
        }

        dataarrays["siconc_rmse"] = (sic_diff**2).mean(dim=dims) ** (1 / 2)
        dataarrays["siconc_rmse"].attrs = {
            "standard_name": "sea_ice_concentration_rmse",
            "units": "%",
            "long_name": "Sea ice concentration root mean square error",
        }

        dataarrays["rms_sic_obs_error"] = (sic_obs_err**2).mean(dims) ** (1 / 2)
        dataarrays["rms_sic_obs_error"].attrs = {
            "standard_name": "root_mean_square_sea_ice_concentration_observation_error",
            "units": "%",
            "long_name": "Root mean square sea ice concentration observation error",
        }

        dataarrays["iiee"] = (over + under) * grid_cell_area
        dataarrays["iiee"].attrs = {
            "standard_name": "integrated_ice_edge_error",
            "units": "$10^6$km$^2$",
            "long_name": "Integrated ice edge error",
        }

        dataarrays["siextent_bias"] = (over - under) * grid_cell_area
        dataarrays["siextent_bias"].attrs = {
            "standard_name": "sea_ice_extent_bias",
            "units": "$10^6$km$^2$",
            "long_name": "Sea ice extent bias",
        }

        datasets.append(xr.Dataset(dataarrays).expand_dims(origin=[origin]))
    return xr.concat(datasets, "origin") if datasets else xr.Dataset()
    

def compute_sea_ice_evaluation_diagnostics(
    ds, sic_threshold, region, **regrid_kwargs
):
    datasets_sat = get_satellite_data(ds["time"], region)
    ds = get_monthly_interpolated_data(
        ds, add_stde=False, check_values=False, region=region, **regrid_kwargs
    )
    return compare_model_vs_obs(
        ds, datasets_sat, sic_threshold, grid_cell_area=25**2)

1.5 Post-processing and plotting of the time series#

  • postprocess_dataset is applied after loading from the cache and renames some variables and dimensions for easier use.

  • plot_timeseries plots time series for each error metric (sea ice concentration bias and RMSE, sea ice extent bias and IIEE), and for each region (Arctic or Antarctic).

Hide code cell source
def postprocess_dataset(ds):
    ds = ds.rename(
        {
            "siconc_bias": "Bias in concentration",
            "siconc_rmse": "RMSE in concentration",
            "rms_sic_obs_error": "RMS obs error in concentration",
            "iiee": "IIEE",
            "siextent_bias": "Extent bias",
        }
    )
    ds["region"] = [
        {"northern_hemisphere": "Arctic", "southern_hemisphere": "Antarctic"}[region]
        for region in ds["region"].values
    ]
    return ds.compute()


def plot_timeseries(ds_cmip6, func=None, title=None, **kwargs):
    if func:
        ds_cmip6 = func(ds_cmip6, **kwargs)
    else:
        assert not kwargs, f"{func=} but {kwargs=}"
    # err_name = "RMS obs error"
    err_name = "RMS obs error in concentration"
    err_colors = ["lightgray", "darkgray"]
    ds_error = ds_cmip6[[err_name]].mean(dim="model")#should be the same for all models
    da_cmip6 = ds_cmip6.drop_vars([err_name]).to_array()

    # get median and interquartile limits to show the spread of the models
    da_median = da_cmip6.median(dim="model")
    da_iql = da_cmip6.quantile([1 / 4, 3 / 4], dim="model")

    # plot the median
    for i, (origin, da) in enumerate(da_median.groupby("origin")):
        kwargs = {
            "color": f"C{i}",
            "label": f"CMIP6 vs {origin} median",
        }
        if not i:
            facet = da.plot(
                row="variable", col="region", hue="origin", sharey=False, add_legend=False, **kwargs
            )
            facet.set_titles(template="{value}")
        else:
            for ax, sel_dict in zip(facet.axs.flatten(), facet.name_dicts.flatten()):
                ax.plot(da["time"], da.sel(sel_dict).values.flatten(), **kwargs)

        # Plot spread and add observation errors
        for ax, sel_dict in zip(facet.axs.flatten(), facet.name_dicts.flatten()):
            kwargs = {"color": f"C{i}"}
            
            da2 = da_iql.sel(sel_dict | {"origin": origin})
            ax.fill_between(
                da2["time"],
                da2.sel(quantile=1 / 4),
                da2.sel(quantile=3 / 4),
                alpha=0.4,
                label=f"CMIP6 vs {origin} IQL",
                zorder=2,
                **kwargs,
            )
            ax.grid(linestyle=":")

            if sel_dict["variable"] == "Bias in concentration":
                # add obs errors to plots
                da_err = ds_error.sel({"origin": origin, "region": sel_dict["region"]})[err_name]
                ax.fill_between(
                    da_err["time"],
                    - da_err,
                    da_err,
                    alpha=0.4,
                    label=f"RMS obs. error ({origin})",
                    zorder=i,
                    color=err_colors[i],
                )
            elif sel_dict["variable"] == "RMSE in concentration":
                # add obs errors to plots
                da_err = ds_error.sel({"origin": origin, "region": sel_dict["region"]})[err_name]
                ax.fill_between(
                    da_err["time"],
                    0,
                    da_err,
                    alpha=0.4,
                    label=f"RMS obs. error ({origin})",
                    zorder=i,
                    color=err_colors[i],
                )
            elif sel_dict["variable"] == "Extent bias":
                # fix time labels for last variable
                xticks = [pd.Timestamp(year,1,1) for year in range(1980,2016,5)]
                ax.set_xticks(xticks)
                xtick_labels = ax.get_xticklabels()
                ax.set_xticklabels(xtick_labels, rotation=45, rotation_mode="anchor",
                                   ha='right', va="center")

    # Edit axs
    for ax, sel_dict in zip(facet.axs[:, 0], facet.name_dicts[:, 0]):
        variable = sel_dict.pop("variable")
        da = ds_cmip6.sel(sel_dict)[variable]
        ax.set_ylabel(f"[{da.attrs['units']}]")
    facet.axs[0, -1].legend(bbox_to_anchor=(1.1, 1))
    for ax in facet.axs[-1,:]:
        ax.set_xlabel("")
    facet.axs[0, -1].legend(bbox_to_anchor=(1.1, 1))
    if title is not None:
        facet.fig.suptitle(title)

    return facet

1.6 Functions to compute the climatologies#

  • compute_monthly_climatology transforms a dataset (which can contain model or observation data) by sorting according to month, taking the time average, and interpolating to the satellite grid (model data only).

  • For a specified time interval, get_monthly_climatology_eumetsat downloads the satellite data and calculates the climatology for the selected region and months.

  • For a specified time interval, get_monthly_climatology_model downloads and calculates the climatology for the selected region, months and model.

  • get_monthly_climatologies_cmip6 calls get_monthly_climatology_model once for each model and calculates the climatology of the ensemble mean.

Hide code cell source
def compute_monthly_climatology(ds, **kwargs):
    
    def get_year(t):
        if hasattr(t, 'year'):
            return t.year
        return pd.Timestamp(t).year
    time = np.sort(ds["time"].values)
    year1 = get_year(time[0])
    year2 = get_year(time[-1])
    
    ds = (
        ds.groupby('time.month').mean(dim='time')
        .expand_dims(years=[f"{year1} - {year2}"])
    )
    if kwargs:
        ds = interpolate_to_satellite_grid(ds, **kwargs)
    return ds


def get_monthly_climatology_eumetsat(request, year1, year2, months, **kwargs):
    cid, req = request
    ndays = max([calendar.monthrange(year1, month)[1] for month in months])
    ds = download.download_and_transform(
        cid,
        req | {
            'year': [str(year) for year in range(year1, year2 + 1)],
            'month': [f'{month:02d}' for month in months],
            'day': [f'{day:02d}' for day in range(1, 1 + ndays)],
        },
        **kwargs,
    )
    return postprocess_climatology(ds)


def get_monthly_climatology_model(request, year1, year2, months, **kwargs):
    cid, req = request
    ds = download.download_and_transform(
        cid,
        req | {
            'year': [str(year) for year in range(year1, year2 + 1)],
            'month': [f'{month:02d}' for month in months],
        },
        **kwargs,
    )
    return postprocess_climatology(ds)


def get_monthly_climatologies_cmip6(models, request, year1, year2, months, **kwargs):
    """
    Loops over models and call get_monthly_climatology_model inside this function.
    Only returns the ensemble mean climatology to save memory.
    Also cleans the output as some models produce extra variables.
    """
    cid, req = request
    tmp_datasets = []
    for i, model in enumerate(models):
        print(f"{model = }, ({i}/{len(models)}")
        tmp_datasets += [
            get_monthly_climatology_model(
                (cid, req | {"model": model}),
                year1, year2, months, **kwargs
            ).expand_dims(model=[model])]
    ds = xr.merge(tmp_datasets).mean(dim="model")
    tmp_datasets = []

    # some models produce extra variables so drop any that are not needed
    vars_to_keep = [
        'xc',
        'yc',
        'years',
        'month',
        'model',
        'latitude',
        'longitude',
        'Sea ice concentration',
        ]
    return ds.drop_vars([v for v in ds.variables if v not in vars_to_keep])

1.7 Post-processing and plotting of climatologies#

  • postprocess_climatology is applied after loading a climatology from the cache. It makes the name and units of the sea ice concentration variable consistent between datasets.

  • make_sic_maps plots the climatology of the CMIP6 ensemble mean next to the observed climatology for the three decades considered (1985-1994, 1995-2004, 2005-2014), which are shown in rows.

  • make_sic_bias_maps plots the bias (CMIP6 ensemble mean minus the observations) for the same decades.

  • compare_sic_maps is a wrapper for those functions.

Hide code cell source
def postprocess_climatology(ds):
    # rename month
    ds['month'] = [calendar.month_name[i] for i in ds['month'].values]
    
    # rename SIC and convert to %
    sic = ds.cf["sea_ice_area_fraction"]
    old_name = sic.name
    new_name = "Sea ice concentration"
    sic.attrs["long_name"] = new_name
    sic_is_normalized = sic.attrs.get("units", "") == "(0 - 1)"
    sic.attrs["units"] = "%"
    ds[old_name] = 100 * sic if sic_is_normalized else sic
    ds = ds.rename({old_name: new_name})
    return ds.compute()


def make_sic_maps(ds_eumetsat, ds_cmip6, proj, sel_dict, isel_dict):
    get_sic = lambda ds: ds.sel(sel_dict).isel(**isel_dict)["Sea ice concentration"]
    sic_eumetsat = get_sic(ds_eumetsat)
    sic_cmip6 = get_sic(ds_cmip6).where(sic_eumetsat >= 0)
    tmp_datasets = [
        sic_cmip6.to_dataset().expand_dims(source=["CMIP6 ens. mean"]),
        sic_eumetsat.to_dataset().expand_dims(source=["EUMETSAT OSI-SAF"]),
    ]
    ds_compare_obs = xr.concat(tmp_datasets, "source")
    facet_grid = plot.projected_map(
        ds_compare_obs["Sea ice concentration"],
        projection=proj,
        show_stats=False,
        row=("years"),
        col=("source"),
        cmap=cm.ice,
        cbar_kwargs={'pad': .065, 'shrink': .6},
    )
    facet_grid.set_titles(template="{value}")
    plt.show()


def make_sic_bias_maps(ds_eumetsat, ds_cmip6, proj, sel_dict, isel_dict):
    get_sic = lambda ds: ds.sel(sel_dict).isel(**isel_dict)["Sea ice concentration"]
    sic_obs = get_sic(ds_eumetsat)
    get_bias = lambda sic, source: (
        (sic - sic_obs)
        .rename("Sea ice concentration bias")
        .to_dataset()
        .expand_dims(source=[source])
    )
    ds_bias = get_bias(get_sic(ds_cmip6), "CMIP6 ens. mean")
    facet_grid = plot.projected_map(
        ds_bias["Sea ice concentration bias"],
        projection=proj,
        show_stats=False,
        row=("years"),
        col=("source"),
        cmap=cm.balance,
        vmin=-50,
        vmax=50,
        cbar_kwargs={"pad" : .13, 'shrink': .6, 'extend': 'both'},
    )
    facet_grid.set_titles(template="{value}")
    plt.show()


def compare_sic_maps(datasets_eumetsat, datasets_cmip6, region, sel_dict, projections, map_slices, plot_func=make_sic_maps):
    plot_func(
        ds_eumetsat=datasets_eumetsat[region],
        ds_cmip6=datasets_cmip6[region],
        proj=projections[region],
        sel_dict=sel_dict,
        isel_dict=map_slices[region],
    )

2. Download and transform data#

Download and transform the data and save it to disk. The second time download_and_transform is run it will save time by loading the transformed data from the disk.

2.1 Set some parameters for downloading#

  • interpolation_kwargs sets some interpolation options.

  • io_kwargs sets some parameters to speed up the download process.

  • eval_kwargs contains all the options that will be passed to download.download_and_transform when creating the time series of evaluation metrics.

Hide code cell source
interpolation_kwargs = {
    "method": "nearest_s2d",
    "periodic": True,
    "ignore_degenerate": True,
}

# Parameters to speed up IO
io_kwargs = {
    "concat_dim": "time",
    "combine": "nested",
    "data_vars": "minimal",
    "coords": "minimal",
    "compat": "override",
    "drop_variables": ("type",),
}

eval_kwargs = io_kwargs | {
    "transform_func": compute_sea_ice_evaluation_diagnostics,
    "chunks": {"year": 10},
}

region_name_mapper = {'northern_hemisphere' : 'Arctic', 'southern_hemisphere': 'Antarctic'}

2.2 Download and transform CMIP6 data for time series#

For each model and region, we download the data, transform it with compute_sea_ice_evaluation_diagnostics, save it to disk and post-process with postprocess_dataset. The time series are then combined into one dataset with extra dimensions model and region.

Hide code cell source
# hide-output
datasets = []
for i,model in enumerate(models):
    for region in regions:
        print(f"{model = } ({i + 1}/{len(models)}), {region = }")
        # options to be passed to compute_sea_ice_evaluation_diagnostics
        transform_func_kwargs = interpolation_kwargs | {
            "sic_threshold": sic_threshold,
            "region": region,
        }
        ds = download.download_and_transform(
            request_cmip6_historical[0],
            request_cmip6_historical[1] | {"model": model},
            transform_func_kwargs=transform_func_kwargs,
            **eval_kwargs,
        )
        datasets.append(
            postprocess_dataset(ds.expand_dims(region=[region], model=[model]))
        )
ds_cmip6 = xr.merge(datasets)
del datasets

2.3 Download EUMETSAT OSI-SAF climatology#

For each region and decade, we download the climatology of satellite sea ice concentration (EUMETSAT OSI-SAF product).

Hide code cell source
# hide-output
cid, (req,) = request_eumetsat
eumetsat_clim_kwargs = io_kwargs | {
    "drop_variables": (
        "type",
        'total_standard_error',
        'Lambert_Azimuthal_Grid',
        'status_flag',
        'raw_ice_conc_values',
        'smearing_standard_error',
        'algorithm_standard_error',
    ),
    "transform_func": compute_monthly_climatology,
}
datasets_eumetsat = {}
tmp_datasets = []
for region in regions:
    for year1 in decades_historical:
        print(f"{region = }, {year1 = }")
        tmp_datasets += [get_monthly_climatology_eumetsat(
            (cid, req | {'region': region}), year1, year1 + 9, clim_months, **eumetsat_clim_kwargs,
        )]
    datasets_eumetsat[region_name_mapper[region]] = xr.merge(tmp_datasets)
    tmp_datasets = []

2.4 Download the mean climatology for the historical CMIP6 experiment#

We use all models in the historical experiment to get the ensemble mean and get the climatology for one region and decade at a time to see the spatial distribution of the bias when compared to satellite estimates, and to see how the bias develops with time.

Hide code cell source
# hide-output
datasets_cmip6 = {}
tmp_datasets = []
for region in regions:
    cmip6_clim_kwargs = dict(
        **io_kwargs,
        transform_func=compute_monthly_climatology,
        transform_func_kwargs=dict(region=region, **interpolation_kwargs),
    )
    for year1 in decades_historical:
        print(f"{region = }, {year1 = }")
        tmp_datasets += [get_monthly_climatologies_cmip6(
            models, request_cmip6_historical, year1, year1 + 9, clim_months, **cmip6_clim_kwargs,
        )]
    datasets_cmip6[region_name_mapper[region]] = xr.merge(tmp_datasets)
    tmp_datasets = []

# set some options for plotting the climatologies
map_kwargs = {
    "datasets_eumetsat": datasets_eumetsat,
    "datasets_cmip6": datasets_cmip6,
    "projections": projections,
    "map_slices": map_slices,
}

3. Results#

3.1 Time series of evaluation metrics#

The evalution metrics that will be the most important here are the IIEE and the bias in extent, as the determination of the presence or absence of ice from passive microwave is more reliable than the actual value of the concentration derived from it. In particular, there is systematic underestimation of the concentration. This is especially true in summer due to meltponds for example, but it is also true in winter where the EUMETSAT OSI-SAF product usually estimates about 90% concentration in the Arctic pack (when it is very close to 100%, as can be seen from SAR images for example). The ESA CCI product which has a smaller footprint usually estimates a higher concentration than EUMETSAF OSI-SAF in the Arctic pack. Nevertheless it is still interesting to look at the area-weighted bias and RMSE in concentration itself - for example an underestimation of concentration would generally be undesirable for a model. The plots show the median and inter-quartile limits (IQL) to show the spread of our ensemble of CMIP6 models (all those that output sea ice concentration in the historical experiment).

The IIEE for the CMIP6 models is quite high, being about \(2\times10^6\)km\(^2\) in the Arctic and twice this in the Antarctic. (Refer to the sea ice diagnostics notebook which shows the historical Arctic minimum extent is about \(8\times10^6\)km\(^2\).) The bias in Arctic extent is not too high, although it is quite variable. For Antarctica however it is quite high, underestimating it by about \(4\times10^6\)km\(^2\) by 2015 (growing from around \(1\times10^6\)km\(^2\) in 1980).

The bias for the concentration itself is quite low for CMIP6, being well within the limits of the observation error. There is a clear tendency for underestimation by CMIP6 in Antarctica though, with the bias dropping linearly. The RMS error for CMIP6 is also very high, indicating a different spatial distribution to the satellite data, since the bias was not too high.

We will take a look at the spatial distribution more closely in the next section.

Finally we note that when the ESA-CCI and EUMETSAT OSI-SAF products overlap in time (after 2002), the metrics for CMIP6 are quite similar.

Hide code cell source
_ = plot_timeseries(ds_cmip6)
../../_images/c160f065d0e7bd1e04b85f8a9b45720381063c56e37ac567722edf49db53df99.png

3.2 Spatial distribution of errors#

We compare 10-year-averaged maps from 1985-2015 which is roughly the overlap period between the historical experiment and the EUMETSAT OSI-SAF product. In the Arctic, the mean concentration is quite close to the mean from observations, but there is disagreement in the seas with seaonal ice - e.g. Greenland, Barents, Labrador, Bering and Okhotsk Seas and Hudson Bay, and (in the summer) off the Russian and Alaskan coasts. For the Antarctic there is systematic underestimation of concentration.

3.2.1 Arctic minimum#

Looking at the maps below for September, we can see there is a general underestimation in the pack ice and an overestimation in the MIZ and near the coasts. This pattern persists with time although the locations move as the Arctic ice cover shrinks with time. This may be a resolution effect, since roughly the same amount of ice (low bias in sea ice concentration) may just be being spread out over a larger area.

Hide code cell source
compare_sic_maps(region="Arctic", sel_dict={"month": "September"}, **map_kwargs)
../../_images/3d5a4c515550bc4d1530f9f6f3bd555af94476a7e4aeef2a38aa3f950b3c39b9.png
Hide code cell source
compare_sic_maps(region="Arctic", sel_dict={"month": "September"}, plot_func=make_sic_bias_maps, **map_kwargs)
../../_images/7518d79e6921062e84f688e99a2276b2dce5477f9ef36e609bc02f8c6283a6df.png

3.2.2 Arctic maximum#

Looking at the maps below for March, we can see the ice in the central Arctic is generally unbiased, but there are some areas where there is strong underestimation - the Bering Sea and the Sea of Okhotsk - and there is strong overestimation in the Greenland Sea and the North Atlantic Ocean near the entrance to Hudson Bay. Overall, the total ice area is roughly the same in CMIP6 and the satellite observations.

Hide code cell source
compare_sic_maps(region="Arctic", sel_dict={"month": "March"}, **map_kwargs)
../../_images/f5334cce5d76da89b11f1cf6b77db271e1ee2e4627d60dcf646eb2f3b32d4421.png
Hide code cell source
compare_sic_maps(region="Arctic", sel_dict={"month": "March"}, plot_func=make_sic_bias_maps, **map_kwargs)
../../_images/a0fcff5fc6ab519f155fe44f5248069dc33590d0069ddce710cd217dff7bfe14.png

3.2.3 Antarctic minimum#

There is consistently too little ice on the Atlantic side of Antarctica (in the Weddell, Bellingshausen and Amundsen Seas). The amount of sea ice in those areas is also decreasing with time, while the concentration from satellite is staying relatively constant. On the Pacific and Indian ocean sides there is more of a dipole pattern (too little at the coast and too much away from it), probably due to the effect of low resolution in the CMIP6 models.

Hide code cell source
compare_sic_maps(region="Antarctic", sel_dict={"month": "March"}, **map_kwargs)
../../_images/4eaac4ec8a3d220c6b5368e3a8259af8b1dbaa26fda747e25b92d91d44de83fc.png
Hide code cell source
compare_sic_maps(region="Antarctic", sel_dict={"month": "March"}, plot_func=make_sic_bias_maps, **map_kwargs)
../../_images/019b8cb68bf16e359d3002843a965b0807b46f09d60c4dbd191ea907f47ab387.png

3.2.4 Antarctic maximum#

For the month of September there is in general too little ice everywhere, although away from the coast at longitude about 140\(^\circ\)W in the Atlantic Ocean, and in the Indian Ocean the underestimation is most pronounced. The underestimation on those areas is also increasing with time, while the concentration from satellite is staying relatively constant in this month as well.

Hide code cell source
compare_sic_maps(region="Antarctic", sel_dict={"month": "September"}, **map_kwargs)
../../_images/d9ca018e52892bfd45c86e018f364f42afe22112caa3fcf0079320923a6f2a7d.png
Hide code cell source
compare_sic_maps(region="Antarctic", sel_dict={"month": "September"},
                 plot_func=make_sic_bias_maps, **map_kwargs)
../../_images/b242f0841221a4332dcc668ca1e0f0c7c4fa6130e5ecb7714838c8fa95fab246.png

3.2.5 Arctic December (relevant to the Arctic shipping route assessment)#

One of the other CMIP6 quality assessments looks at projections of accessibility of Arctic shipping routes, where sea ice in December is particularly interesting. Hence we also plot some maps for this month here. The CMIP6 ensemble mean is quite close to the satellite data in the Arctic Ocean, with the main feature of the bias maps being the underestimation in the Hudson Bay. Note that not all models that provide sea ice concentration in the historical experiment also provide it in the different projection experiments, so that adds some additional uncertainty to these conclusions.

Hide code cell source
compare_sic_maps(region="Arctic", sel_dict={"month": "December"}, **map_kwargs)
../../_images/b2036f324b045e87da9aaeeab8a3436a464305e62a88276f353cf3f87279e4dc.png
Hide code cell source
compare_sic_maps(region="Arctic", sel_dict={"month": "December"},
                 plot_func=make_sic_bias_maps, **map_kwargs)
../../_images/7eca8b6a4c52646f84853a874c9e60a3c1941c99f4ccdf272e5446a5e90371fe.png

ℹ️ If you want to know more#

Key resources#

Introductory sea ice materials:

Introductory CMIP6 materials:

Code libraries used:

References#

  1. Davy, R., and S. Outten (2020). The Arctic Surface Climate in CMIP6: Status and Developments since CMIP5. J. Climate, 33, 8047–8068, https://doi.org/10.1175/JCLI-D-19-0990.1.

  2. Frankignoul, C., L. Raillard, B. Ferster, and Y. Kwon (2024). Arctic September sea ice concentration biases in CMIP6 models and their relationships with other model variables. J. Climate, https://doi.org/10.1175/JCLI-D-23-0452.1, in press.

  3. Goessling, H. F., S. Tietsche, J. J. Day, E. Hawkins, and T. Jung (2016). Predictability of the Arctic sea ice edge, Geophys. Res. Lett., 43, 1642–1650, https://doi.org/10.1002/2015GL067232.

  4. Henke, M., F. Cassalho, T. Miesse, C. M. Ferreira, J. Zhang and T. M. Ravens (2023). Assessment of Arctic sea ice and surface climate conditions in nine CMIP6 climate models. Arctic, Antarctic, and Alpine Research, 55(1). https://doi.org/10.1080/15230430.2023.2271592

  5. Nie, Y., X. Lin, Q. Yang, J. Liu, D. Chen and P. Uotila (2023). Differences between the CMIP5 and CMIP6 Antarctic sea ice concentration budgets. Geophysical Research Letters, 50, e2023GL105265. https://doi.org/10.1029/2023GL105265

  6. Li, S., Y. Zhang, C. Chen, Y. Zhang, D. Xu, and S. Hu (2023). Assessment of Antarctic Sea Ice Cover in CMIP6 Prediction with Comparison to AMSR2 during 2015–2021. Remote Sensing 15(8): 2048. https://doi.org/10.3390/rs15082048.

  7. Roach, L. A., J. Dörr, C. R. Holmes, F. Massonnet, E. W. Blockley, D. Notz, T. Rackow, M. N. Raphael, S. P. O’Farrell, D. A. Bailey, and C. M. Bitz (2020). Antarctic sea ice area in CMIP6. Geophysical Research Letters, 47, e2019GL086729. https://doi.org/10.1029/2019GL086729

  8. Shu, Q., Q. Wang, Z. Song, F. Qiao, J. Zhao, M. Chu and X. Li (2020). Assessment of sea ice extent in CMIP6 with comparison to observations and CMIP5. Geophysical Research Letters, 47, e2020GL087965, https://agupubs.onlinelibrary.wiley.com/doi/10.1029/2020GL087965

  9. SIMIP Community (2020). Arctic sea ice in CMIP6. Geophysical Research Letters, 47, e2019GL086749. https://doi.org/10.1029/2019GL086749

  10. Watts, M., W. Maslowski, Y. J. Lee, J. C. Kinney, and R. Osinski (2021). A Spatial Evaluation of Arctic Sea Ice and Regional Limitations in CMIP6 Historical Simulations. J. Climate, 34, 6399–6420, https://doi.org/10.1175/JCLI-D-20-0491.1.