5.1.7. Historical accuracy of sea ice extent in the CMIP6 projections#
Production date: 2024-05-31
Produced by: Timothy Williams, Nansen Environmental and Remote Sensing Center (NERSC)
🌍 Use case: Scientific assessment of sea ice extent in CMIP6 models to inform product creation#
❓ Quality assessment question#
How reliable are the historical estimates of the Arctic and Antarctic sea ice concentration and extent from the CMIP6 models?
📢 Quality assessment statement#
These are the key outcomes of this assessment
Bearing in mind that sea ice concentration estimates from passive microwave observations themselves are quite uncertain, we find that the errors between them and the CMIP6 models in the historical experiment can have significant biases and also have quite a large spread, so should be treated with some caution.
We find that the CMIP6 models generally underestimate the sea ice extent and area. Errors for the Antarctic are approximately double those in the Arctic. While the Arctic sea ice minima are generally quite accurate, the Arctic maxima are consistently underestimated, as are the Antarctic extrema.
At the time of the Arctic minimum, there is a general underestimation in the pack ice, but an overestimation in the marginal ice zone (MIZ) and near the coasts.
At the time of the Arctic maximum, the concentration in the pack is relatively unbiased, but there are some areas where there is strong underestimation (Bering Sea, Sea of Okhotsk)and overestimation (Greenland Sea, Labrador Sea).
Arctic December: one of the other CMIP6 quality assessments looks at projections of accessibility of Arctic shipping routes, where sea ice in December is particularly interesting. Therefore we also check the reliability of the December concentrations in the CMIP6 models. We find that the concentration in the pack is quite similar to the observations, although there is significant underestimation in Hudson Bay, and less pronounced underestimation in the Bering Sea. In later years, there is also slight underestimation off the Russian coast. The ice extent in the Greenland and Barents Sea is consistently overestimated.
At the time of the Antarctic minimum, there is strong underestimation in the Weddell, Bellingshausen and Amundsen Seas, while south of the Pacific and Indian Oceans, there is less bias. However, there there is too little ice at the coast and too much away from it.
At the time of the Antarctic maximum, there is in general too little ice everywhere, with the region away from the coast at longitude about 140W south of the Atlantic Ocean, and in the region south of the Indian Ocean having the most pronounced underestimation. The underestimation on those areas is also increasing with time, while the concentration from satellite is staying relatively constant in this month as well.
These results are generally consistent with analyses from other authors - for the Arctic by the SIMIP community (2020), Davy and Outten (2020), Shu et al (2020), Watts et al (2021), Henke et al (2023) and Frankignoul et al (2024); for the Antarctic by Roach et al (2020), Shu et al (2020), Nie et al (2023) and Li et al (2023).
We ourselves don’t consider time series of the sea ice minima themselves, but only plot climatologies of the multi-model mean to see how the differences between the models and the observations are distributed spatially. However, this has been done by a few others, e.g. Shu et al (2020), who found the observed Arctic September sea ice extent (SIE) declining trend between 1979 and 2014 is slightly underestimated in CMIP6 models, while the observed weak but significant upward trend of the Antarctic SIE is not captured.
📋 Methodology#
We compare sea ice concentrations from the CMIP6 historical experiment with that obtained from passive microwave satellite products from EUMETSAT OSI- SAF and ESA CCI. Time series of the evaluation metrics Integrated Ice Edge Error (IIEE) (Goessling et al, 2016; Henke et al, 2023), bias in extent and area, and the root mean square error (RMSE) in sea ice concentration are produced and plotted. All of these statistics are area-weighted averages. We consider all the CMIP6 models that output sea ice concentration in the historical experiment, but remove outliers by only plotting the interquartile limits (IQLs). Other authors have chosen different subsetting approaches - e.g. by scoring the models and only retaining the top-performing ones. We recommend users test as many models as possible, before deciding if a subset adequately represents the uncertainty for their application.
Decadal climatologies for both the models and observations are also produced and compared, for the months March, September and December.
The “Analysis and results” section is organised as follows:
1. Import libraries, set parameters and definitions of functions
2. Download and transform data
3.1 Time series of evaluation metrics
Plot time series of the error metrics.
3.2 Spatial distribution of errors
This section has maps of climatologies and their biases for the Arctic minimum, Arctic maximum, Antarctic minimum, Antarctic maximum. We also plot maps for the Arctic December since this was an interesting month for the climate projections.
📈 Analysis and results#
1. Import libraries, set parameters and definitions of functions#
1.1 Import libraries#
Import the required libraries, including the EQC toolbox.
Show code cell source
import datetime
import warnings
import calendar
import pandas as pd
import numpy as np
import cartopy.crs as ccrs
import matplotlib.pyplot as plt
import xarray as xr
from cmocean import cm
from c3s_eqc_automatic_quality_control import diagnostics, download, plot
plt.style.use("seaborn-v0_8-notebook")
warnings.filterwarnings("ignore", module="cf_xarray")
warnings.filterwarnings("ignore", module="numpy")
1.2 Set parameters#
Set the time period to be analysed with
year_start
andyear_stop
.Set the concentration threshold
sic_threshold
for determining sea ice extent (we use 30% to be consistent with the ice edge product).clim_months
is a list of the months to plot the climatologies for.decades_historical
of starting years of decades to calculate climatologies for.projections
is a dictionary of projections to plot the climatology maps in.map_slices
is a dictionary of slices in the x and y directions (on observation grid) to zoom in on the climatology maps.Set the regions to be analysed with the list
regions
.Set the models to be evaluated with the list
models
.
Show code cell source
# Time
year_start = 1970
year_stop = 2019
assert year_start >= 1970 and year_stop <= 2019
assert not (year_start % 10 or (year_stop + 1) % 10)
# Sea Ice Concentration Threshold
sic_threshold = 30 # %
# months to plot the climatologies for
clim_months = [3, 9, 12]
# starting years of decades to calculate climatologies for
decades_historical = range(1985, 2015, 10)
# projections to plot the climatology maps in
projections = {
"Arctic" : ccrs.Stereographic(central_latitude=90.),
"Antarctic" : ccrs.Stereographic(central_latitude=-90.),
}
# slices in x and y directions (on observation grid) to zoom in on the climatology maps
map_slices = {
"Arctic": {'xc': slice(50, -100), 'yc': slice(50, -50)},
"Antarctic": {'xc': slice(50, -50), 'yc': slice(50, -50)},
}
# Select regions
regions = ["northern_hemisphere", "southern_hemisphere"]
assert set(regions) <= {
"northern_hemisphere",
"southern_hemisphere",
}
# Choose CMIP6 historical models
models = [
"access_cm2",
"access_esm1_5",
"cams_csm1_0",
"canesm5",
"canesm5_canoe",
"cmcc_cm2_hr4",
"cmcc_cm2_sr5",
"cmcc_esm2",
"cnrm_cm6_1",
"cnrm_cm6_1_hr",
"cnrm_esm2_1",
"e3sm_1_0",
"e3sm_1_1",
"e3sm_1_1_eca",
"ec_earth3_aerchem",
"ec_earth3_cc",
"ec_earth3_veg_lr",
"fgoals_f3_l",
"giss_e2_1_h",
"hadgem3_gc31_ll",
"hadgem3_gc31_mm",
"inm_cm4_8",
"inm_cm5_0",
"ipsl_cm5a2_inca",
"ipsl_cm6a_lr",
"miroc6",
"miroc_es2l",
"mpi_esm1_2_hr",
"mpi_esm1_2_lr",
"mri_esm2_0",
"nesm3",
"norcpm1",
"taiesm1",
"ukesm1_0_ll",
]
1.3 Define requests for CDS data#
Define the download requests in the format needed by the EQC toolbox.
Show code cell source
all_months = [f"{month:02d}" for month in range(1, 13)]
request_cmip6_historical = (
"projections-cmip6",
{
"format": "zip",
"temporal_resolution": "monthly",
"experiment": "historical",
"variable": "sea_ice_area_percentage_on_ocean_grid",
"year": [
str(year) for year in range(max(year_start, 1850), min(year_stop, 2014) + 1)
],
"month": all_months,
},
)
request_eumetsat = (
"satellite-sea-ice-concentration",
download.update_request_date(
{
"cdr_type": "cdr",
"origin": "EUMETSAT OSI SAF",
"sensor": "ssmis",
"temporal_aggregation": "daily",
"variable": "all",
"version": "v2",
},
start=f"{max(year_start, 1979)}-01",
stop=f"{min(year_stop, 2015)}-12",
stringify_dates=True,
),
)
1.4 Functions to create the time series#
These functions are all applied to a single CMIP6 model.
interpolate_to_satellite_grid
interpolates the model to the satellite grid.get_monthly_interpolated_data
is applied to both the model and satellite data, to take the monthly mean (the satellite data is daily, but this also forces all the models to have the same time coordinate) and interpolate to the satellite grid (only for model data). It also calculates the RMS error (only for satellite data).get_satellite_data
downloads the daily satellite data and appliesget_monthly_interpolated_data
to get the monthly mean and the RMS error in the data.compare_model_vs_obs
compares the sea ice concentration from a single CMIP6 model with satellite estimates, calculating error metrics like biases in concentration and extent, RMSE, and IIEE.compute_sea_ice_evaluation_diagnostics
downloads the satellite and model data and callscompare_model_vs_obs
to get a time series of error metrics.
Show code cell source
def interpolate_to_satellite_grid(obj, region, **regrid_kwargs):
# Remove nan columns
for dim in [dim for dim in obj.dims if "x" in dim or "lon" in dim]:
for i in (0, -1):
if obj.isel({dim: i}).isnull().all():
obj = obj.drop_isel({dim: i})
collection_id = "satellite-sea-ice-concentration"
request = {
"region": region,
"version": "v2",
"variable": "all",
"format": "zip",
"origin": "ESA CCI",
"sensor": "amsr",
"temporal_aggregation": "daily",
"cdr_type": "cdr",
"year": "2002",
"month": "06",
"day": "01",
}
grid_out = download.download_and_transform(collection_id, request).drop_dims("time")
return diagnostics.regrid(obj, grid_out, **regrid_kwargs)
def get_monthly_interpolated_data(ds, add_stde, check_values, region, **regrid_kwargs):
if add_stde:
stde = ds.cf["sea_ice_area_fraction standard_error"]
ds = ds.cf[["latitude", "longitude", "sea_ice_area_fraction"]]
ds = ds.drop_dims(set(ds.dims) & {"vertices", "bnds"})
if regrid_kwargs:
ds = interpolate_to_satellite_grid(ds, region, **regrid_kwargs)
ds = ds.sortby("time").resample(time="MS").mean()
ds["time"].attrs["long_name"] = "time"
if add_stde:
with xr.set_options(keep_attrs=True):
ds = ds.merge((stde**2).resample(time="MS").mean() ** (1 / 2))
if check_values:
mask = ds.cf["sea_ice_area_fraction"].notnull() & (
ds.cf["sea_ice_area_fraction"] != 0
)
ds = ds.sel(time=mask.any(set(mask.dims) - {"time"}))
return ds
def get_satellite_data(time, region):
year_start = time.dt.year.min().values
year_stop = time.dt.year.max().values
common_request = {
"cdr_type": "cdr",
"variable": "all",
"version": "v2",
"region": region,
}
satellite_requests = {
"ESA-CCI": download.update_request_date(
common_request
| {
"origin": "ESA CCI",
"sensor": "amsr",
"temporal_aggregation": "daily",
},
start=f"{max(year_start, 2002)}-01",
stop=f"{min(year_stop, 2017)}-12",
stringify_dates=True,
),
"EUMETSAT-OSI-SAF": download.update_request_date(
common_request
| {
"origin": "EUMETSAT OSI SAF",
"sensor": "ssmis",
"temporal_aggregation": "daily",
},
start=f"{max(year_start, 1979)}-01",
stop=f"{min(year_stop, 2015)}-12",
stringify_dates=True,
),
}
datasets_satellite = {}
for name, requests in satellite_requests.items():
if not requests:
continue
print(f"{name=}")
datasets_satellite[name] = download.download_and_transform(
"satellite-sea-ice-concentration",
requests,
chunks={"year": 1},
transform_func=get_monthly_interpolated_data,
transform_func_kwargs={
"add_stde": True,
"check_values": True,
"region": region,
},
)
return datasets_satellite
def compare_model_vs_obs(ds, datasets_sat, sic_threshold, grid_cell_area):
ds = ds.convert_calendar("standard", align_on="date")
datasets_sat = {
k: ds.convert_calendar("standard", align_on="date")
for k, ds in datasets_sat.items()
}
grid_cell_area *= 1.0e-6 # 10^6 km2
sic = ds.cf["sea_ice_area_fraction"]
if sic.attrs.get("units", "") == "(0 - 1)":
sic *= 100
dims = ("xc", "yc")
datasets = []
for origin, ds_sat in datasets_sat.items():
# Get variables
sic_obs = ds_sat.cf["sea_ice_area_fraction"]
sic_obs_err = ds_sat.cf["sea_ice_area_fraction standard_error"]
sic_model = sic.sel(time=sic_obs["time"])
# Compute useful variables
sic_diff = sic_model - sic_obs
over = ((sic_model > sic_threshold) & (sic_obs <= sic_threshold)).sum(dims)
under = ((sic_model <= sic_threshold) & (sic_obs > sic_threshold)).sum(dims)
# Compute output
dataarrays = {}
dataarrays["siconc_bias"] = sic_diff.mean(dims)
dataarrays["siconc_bias"].attrs = {
"standard_name": "sea_ice_concentration_bias",
"units": "%",
"long_name": "Sea ice concentration bias",
}
dataarrays["siconc_rmse"] = (sic_diff**2).mean(dim=dims) ** (1 / 2)
dataarrays["siconc_rmse"].attrs = {
"standard_name": "sea_ice_concentration_rmse",
"units": "%",
"long_name": "Sea ice concentration root mean square error",
}
dataarrays["rms_sic_obs_error"] = (sic_obs_err**2).mean(dims) ** (1 / 2)
dataarrays["rms_sic_obs_error"].attrs = {
"standard_name": "root_mean_square_sea_ice_concentration_observation_error",
"units": "%",
"long_name": "Root mean square sea ice concentration observation error",
}
dataarrays["iiee"] = (over + under) * grid_cell_area
dataarrays["iiee"].attrs = {
"standard_name": "integrated_ice_edge_error",
"units": "$10^6$km$^2$",
"long_name": "Integrated ice edge error",
}
dataarrays["siextent_bias"] = (over - under) * grid_cell_area
dataarrays["siextent_bias"].attrs = {
"standard_name": "sea_ice_extent_bias",
"units": "$10^6$km$^2$",
"long_name": "Sea ice extent bias",
}
datasets.append(xr.Dataset(dataarrays).expand_dims(origin=[origin]))
return xr.concat(datasets, "origin") if datasets else xr.Dataset()
def compute_sea_ice_evaluation_diagnostics(
ds, sic_threshold, region, **regrid_kwargs
):
datasets_sat = get_satellite_data(ds["time"], region)
ds = get_monthly_interpolated_data(
ds, add_stde=False, check_values=False, region=region, **regrid_kwargs
)
return compare_model_vs_obs(
ds, datasets_sat, sic_threshold, grid_cell_area=25**2)
1.5 Post-processing and plotting of the time series#
postprocess_dataset
is applied after loading from the cache and renames some variables and dimensions for easier use.plot_timeseries
plots time series for each error metric (sea ice concentration bias and RMSE, sea ice extent bias and IIEE), and for each region (Arctic or Antarctic).
Show code cell source
def postprocess_dataset(ds):
ds = ds.rename(
{
"siconc_bias": "Bias in concentration",
"siconc_rmse": "RMSE in concentration",
"rms_sic_obs_error": "RMS obs error in concentration",
"iiee": "IIEE",
"siextent_bias": "Extent bias",
}
)
ds["region"] = [
{"northern_hemisphere": "Arctic", "southern_hemisphere": "Antarctic"}[region]
for region in ds["region"].values
]
return ds.compute()
def plot_timeseries(ds_cmip6, func=None, title=None, **kwargs):
if func:
ds_cmip6 = func(ds_cmip6, **kwargs)
else:
assert not kwargs, f"{func=} but {kwargs=}"
# err_name = "RMS obs error"
err_name = "RMS obs error in concentration"
err_colors = ["lightgray", "darkgray"]
ds_error = ds_cmip6[[err_name]].mean(dim="model")#should be the same for all models
da_cmip6 = ds_cmip6.drop_vars([err_name]).to_array()
# get median and interquartile limits to show the spread of the models
da_median = da_cmip6.median(dim="model")
da_iql = da_cmip6.quantile([1 / 4, 3 / 4], dim="model")
# plot the median
for i, (origin, da) in enumerate(da_median.groupby("origin")):
kwargs = {
"color": f"C{i}",
"label": f"CMIP6 vs {origin} median",
}
if not i:
facet = da.plot(
row="variable", col="region", hue="origin", sharey=False, add_legend=False, **kwargs
)
facet.set_titles(template="{value}")
else:
for ax, sel_dict in zip(facet.axs.flatten(), facet.name_dicts.flatten()):
ax.plot(da["time"], da.sel(sel_dict).values.flatten(), **kwargs)
# Plot spread and add observation errors
for ax, sel_dict in zip(facet.axs.flatten(), facet.name_dicts.flatten()):
kwargs = {"color": f"C{i}"}
da2 = da_iql.sel(sel_dict | {"origin": origin})
ax.fill_between(
da2["time"],
da2.sel(quantile=1 / 4),
da2.sel(quantile=3 / 4),
alpha=0.4,
label=f"CMIP6 vs {origin} IQL",
zorder=2,
**kwargs,
)
ax.grid(linestyle=":")
if sel_dict["variable"] == "Bias in concentration":
# add obs errors to plots
da_err = ds_error.sel({"origin": origin, "region": sel_dict["region"]})[err_name]
ax.fill_between(
da_err["time"],
- da_err,
da_err,
alpha=0.4,
label=f"RMS obs. error ({origin})",
zorder=i,
color=err_colors[i],
)
elif sel_dict["variable"] == "RMSE in concentration":
# add obs errors to plots
da_err = ds_error.sel({"origin": origin, "region": sel_dict["region"]})[err_name]
ax.fill_between(
da_err["time"],
0,
da_err,
alpha=0.4,
label=f"RMS obs. error ({origin})",
zorder=i,
color=err_colors[i],
)
elif sel_dict["variable"] == "Extent bias":
# fix time labels for last variable
xticks = [pd.Timestamp(year,1,1) for year in range(1980,2016,5)]
ax.set_xticks(xticks)
xtick_labels = ax.get_xticklabels()
ax.set_xticklabels(xtick_labels, rotation=45, rotation_mode="anchor",
ha='right', va="center")
# Edit axs
for ax, sel_dict in zip(facet.axs[:, 0], facet.name_dicts[:, 0]):
variable = sel_dict.pop("variable")
da = ds_cmip6.sel(sel_dict)[variable]
ax.set_ylabel(f"[{da.attrs['units']}]")
facet.axs[0, -1].legend(bbox_to_anchor=(1.1, 1))
for ax in facet.axs[-1,:]:
ax.set_xlabel("")
facet.axs[0, -1].legend(bbox_to_anchor=(1.1, 1))
if title is not None:
facet.fig.suptitle(title)
return facet
1.6 Functions to compute the climatologies#
compute_monthly_climatology
transforms a dataset (which can contain model or observation data) by sorting according to month, taking the time average, and interpolating to the satellite grid (model data only).For a specified time interval,
get_monthly_climatology_eumetsat
downloads the satellite data and calculates the climatology for the selected region and months.For a specified time interval,
get_monthly_climatology_model
downloads and calculates the climatology for the selected region, months and model.get_monthly_climatologies_cmip6
callsget_monthly_climatology_model
once for each model and calculates the climatology of the ensemble mean.
Show code cell source
def compute_monthly_climatology(ds, **kwargs):
def get_year(t):
if hasattr(t, 'year'):
return t.year
return pd.Timestamp(t).year
time = np.sort(ds["time"].values)
year1 = get_year(time[0])
year2 = get_year(time[-1])
ds = (
ds.groupby('time.month').mean(dim='time')
.expand_dims(years=[f"{year1} - {year2}"])
)
if kwargs:
ds = interpolate_to_satellite_grid(ds, **kwargs)
return ds
def get_monthly_climatology_eumetsat(request, year1, year2, months, **kwargs):
cid, req = request
ndays = max([calendar.monthrange(year1, month)[1] for month in months])
ds = download.download_and_transform(
cid,
req | {
'year': [str(year) for year in range(year1, year2 + 1)],
'month': [f'{month:02d}' for month in months],
'day': [f'{day:02d}' for day in range(1, 1 + ndays)],
},
**kwargs,
)
return postprocess_climatology(ds)
def get_monthly_climatology_model(request, year1, year2, months, **kwargs):
cid, req = request
ds = download.download_and_transform(
cid,
req | {
'year': [str(year) for year in range(year1, year2 + 1)],
'month': [f'{month:02d}' for month in months],
},
**kwargs,
)
return postprocess_climatology(ds)
def get_monthly_climatologies_cmip6(models, request, year1, year2, months, **kwargs):
"""
Loops over models and call get_monthly_climatology_model inside this function.
Only returns the ensemble mean climatology to save memory.
Also cleans the output as some models produce extra variables.
"""
cid, req = request
tmp_datasets = []
for i, model in enumerate(models):
print(f"{model = }, ({i}/{len(models)}")
tmp_datasets += [
get_monthly_climatology_model(
(cid, req | {"model": model}),
year1, year2, months, **kwargs
).expand_dims(model=[model])]
ds = xr.merge(tmp_datasets).mean(dim="model")
tmp_datasets = []
# some models produce extra variables so drop any that are not needed
vars_to_keep = [
'xc',
'yc',
'years',
'month',
'model',
'latitude',
'longitude',
'Sea ice concentration',
]
return ds.drop_vars([v for v in ds.variables if v not in vars_to_keep])
1.7 Post-processing and plotting of climatologies#
postprocess_climatology
is applied after loading a climatology from the cache. It makes the name and units of the sea ice concentration variable consistent between datasets.make_sic_maps
plots the climatology of the CMIP6 ensemble mean next to the observed climatology for the three decades considered (1985-1994, 1995-2004, 2005-2014), which are shown in rows.make_sic_bias_maps
plots the bias (CMIP6 ensemble mean minus the observations) for the same decades.compare_sic_maps
is a wrapper for those functions.
Show code cell source
def postprocess_climatology(ds):
# rename month
ds['month'] = [calendar.month_name[i] for i in ds['month'].values]
# rename SIC and convert to %
sic = ds.cf["sea_ice_area_fraction"]
old_name = sic.name
new_name = "Sea ice concentration"
sic.attrs["long_name"] = new_name
sic_is_normalized = sic.attrs.get("units", "") == "(0 - 1)"
sic.attrs["units"] = "%"
ds[old_name] = 100 * sic if sic_is_normalized else sic
ds = ds.rename({old_name: new_name})
return ds.compute()
def make_sic_maps(ds_eumetsat, ds_cmip6, proj, sel_dict, isel_dict):
get_sic = lambda ds: ds.sel(sel_dict).isel(**isel_dict)["Sea ice concentration"]
sic_eumetsat = get_sic(ds_eumetsat)
sic_cmip6 = get_sic(ds_cmip6).where(sic_eumetsat >= 0)
tmp_datasets = [
sic_cmip6.to_dataset().expand_dims(source=["CMIP6 ens. mean"]),
sic_eumetsat.to_dataset().expand_dims(source=["EUMETSAT OSI-SAF"]),
]
ds_compare_obs = xr.concat(tmp_datasets, "source")
facet_grid = plot.projected_map(
ds_compare_obs["Sea ice concentration"],
projection=proj,
show_stats=False,
row=("years"),
col=("source"),
cmap=cm.ice,
cbar_kwargs={'pad': .065, 'shrink': .6},
)
facet_grid.set_titles(template="{value}")
plt.show()
def make_sic_bias_maps(ds_eumetsat, ds_cmip6, proj, sel_dict, isel_dict):
get_sic = lambda ds: ds.sel(sel_dict).isel(**isel_dict)["Sea ice concentration"]
sic_obs = get_sic(ds_eumetsat)
get_bias = lambda sic, source: (
(sic - sic_obs)
.rename("Sea ice concentration bias")
.to_dataset()
.expand_dims(source=[source])
)
ds_bias = get_bias(get_sic(ds_cmip6), "CMIP6 ens. mean")
facet_grid = plot.projected_map(
ds_bias["Sea ice concentration bias"],
projection=proj,
show_stats=False,
row=("years"),
col=("source"),
cmap=cm.balance,
vmin=-50,
vmax=50,
cbar_kwargs={"pad" : .13, 'shrink': .6, 'extend': 'both'},
)
facet_grid.set_titles(template="{value}")
plt.show()
def compare_sic_maps(datasets_eumetsat, datasets_cmip6, region, sel_dict, projections, map_slices, plot_func=make_sic_maps):
plot_func(
ds_eumetsat=datasets_eumetsat[region],
ds_cmip6=datasets_cmip6[region],
proj=projections[region],
sel_dict=sel_dict,
isel_dict=map_slices[region],
)
2. Download and transform data#
Download and transform the data and save it to disk. The second time download_and_transform
is run it will save time by loading the transformed data from the disk.
2.1 Set some parameters for downloading#
interpolation_kwargs
sets some interpolation options.io_kwargs
sets some parameters to speed up the download process.eval_kwargs
contains all the options that will be passed todownload.download_and_transform
when creating the time series of evaluation metrics.
Show code cell source
interpolation_kwargs = {
"method": "nearest_s2d",
"periodic": True,
"ignore_degenerate": True,
}
# Parameters to speed up IO
io_kwargs = {
"concat_dim": "time",
"combine": "nested",
"data_vars": "minimal",
"coords": "minimal",
"compat": "override",
"drop_variables": ("type",),
}
eval_kwargs = io_kwargs | {
"transform_func": compute_sea_ice_evaluation_diagnostics,
"chunks": {"year": 10},
}
region_name_mapper = {'northern_hemisphere' : 'Arctic', 'southern_hemisphere': 'Antarctic'}
2.2 Download and transform CMIP6 data for time series#
For each model and region, we download the data, transform it with compute_sea_ice_evaluation_diagnostics
, save it to disk and post-process with postprocess_dataset
. The time series are then combined into one dataset with extra dimensions model
and region
.
Show code cell source
# hide-output
datasets = []
for i,model in enumerate(models):
for region in regions:
print(f"{model = } ({i + 1}/{len(models)}), {region = }")
# options to be passed to compute_sea_ice_evaluation_diagnostics
transform_func_kwargs = interpolation_kwargs | {
"sic_threshold": sic_threshold,
"region": region,
}
ds = download.download_and_transform(
request_cmip6_historical[0],
request_cmip6_historical[1] | {"model": model},
transform_func_kwargs=transform_func_kwargs,
**eval_kwargs,
)
datasets.append(
postprocess_dataset(ds.expand_dims(region=[region], model=[model]))
)
ds_cmip6 = xr.merge(datasets)
del datasets
2.3 Download EUMETSAT OSI-SAF climatology#
For each region and decade, we download the climatology of satellite sea ice concentration (EUMETSAT OSI-SAF product).
Show code cell source
# hide-output
cid, (req,) = request_eumetsat
eumetsat_clim_kwargs = io_kwargs | {
"drop_variables": (
"type",
'total_standard_error',
'Lambert_Azimuthal_Grid',
'status_flag',
'raw_ice_conc_values',
'smearing_standard_error',
'algorithm_standard_error',
),
"transform_func": compute_monthly_climatology,
}
datasets_eumetsat = {}
tmp_datasets = []
for region in regions:
for year1 in decades_historical:
print(f"{region = }, {year1 = }")
tmp_datasets += [get_monthly_climatology_eumetsat(
(cid, req | {'region': region}), year1, year1 + 9, clim_months, **eumetsat_clim_kwargs,
)]
datasets_eumetsat[region_name_mapper[region]] = xr.merge(tmp_datasets)
tmp_datasets = []
2.4 Download the mean climatology for the historical CMIP6 experiment#
We use all models in the historical experiment to get the ensemble mean and get the climatology for one region and decade at a time to see the spatial distribution of the bias when compared to satellite estimates, and to see how the bias develops with time.
Show code cell source
# hide-output
datasets_cmip6 = {}
tmp_datasets = []
for region in regions:
cmip6_clim_kwargs = dict(
**io_kwargs,
transform_func=compute_monthly_climatology,
transform_func_kwargs=dict(region=region, **interpolation_kwargs),
)
for year1 in decades_historical:
print(f"{region = }, {year1 = }")
tmp_datasets += [get_monthly_climatologies_cmip6(
models, request_cmip6_historical, year1, year1 + 9, clim_months, **cmip6_clim_kwargs,
)]
datasets_cmip6[region_name_mapper[region]] = xr.merge(tmp_datasets)
tmp_datasets = []
# set some options for plotting the climatologies
map_kwargs = {
"datasets_eumetsat": datasets_eumetsat,
"datasets_cmip6": datasets_cmip6,
"projections": projections,
"map_slices": map_slices,
}
3. Results#
3.1 Time series of evaluation metrics#
The evalution metrics that will be the most important here are the IIEE and the bias in extent, as the determination of the presence or absence of ice from passive microwave is more reliable than the actual value of the concentration derived from it. In particular, there is systematic underestimation of the concentration. This is especially true in summer due to meltponds for example, but it is also true in winter where the EUMETSAT OSI-SAF product usually estimates about 90% concentration in the Arctic pack (when it is very close to 100%, as can be seen from SAR images for example). The ESA CCI product which has a smaller footprint usually estimates a higher concentration than EUMETSAF OSI-SAF in the Arctic pack. Nevertheless it is still interesting to look at the area-weighted bias and RMSE in concentration itself - for example an underestimation of concentration would generally be undesirable for a model. The plots show the median and inter-quartile limits (IQL) to show the spread of our ensemble of CMIP6 models (all those that output sea ice concentration in the historical experiment).
The IIEE for the CMIP6 models is quite high, being about \(2\times10^6\)km\(^2\) in the Arctic and twice this in the Antarctic. (Refer to the sea ice diagnostics notebook which shows the historical Arctic minimum extent is about \(8\times10^6\)km\(^2\).) The bias in Arctic extent is not too high, although it is quite variable. For Antarctica however it is quite high, underestimating it by about \(4\times10^6\)km\(^2\) by 2015 (growing from around \(1\times10^6\)km\(^2\) in 1980).
The bias for the concentration itself is quite low for CMIP6, being well within the limits of the observation error. There is a clear tendency for underestimation by CMIP6 in Antarctica though, with the bias dropping linearly. The RMS error for CMIP6 is also very high, indicating a different spatial distribution to the satellite data, since the bias was not too high.
We will take a look at the spatial distribution more closely in the next section.
Finally we note that when the ESA-CCI and EUMETSAT OSI-SAF products overlap in time (after 2002), the metrics for CMIP6 are quite similar.
Show code cell source
_ = plot_timeseries(ds_cmip6)

3.2 Spatial distribution of errors#
We compare 10-year-averaged maps from 1985-2015 which is roughly the overlap period between the historical experiment and the EUMETSAT OSI-SAF product. In the Arctic, the mean concentration is quite close to the mean from observations, but there is disagreement in the seas with seaonal ice - e.g. Greenland, Barents, Labrador, Bering and Okhotsk Seas and Hudson Bay, and (in the summer) off the Russian and Alaskan coasts. For the Antarctic there is systematic underestimation of concentration.
3.2.1 Arctic minimum#
Looking at the maps below for September, we can see there is a general underestimation in the pack ice and an overestimation in the MIZ and near the coasts. This pattern persists with time although the locations move as the Arctic ice cover shrinks with time. This may be a resolution effect, since roughly the same amount of ice (low bias in sea ice concentration) may just be being spread out over a larger area.
Show code cell source
compare_sic_maps(region="Arctic", sel_dict={"month": "September"}, **map_kwargs)

Show code cell source
compare_sic_maps(region="Arctic", sel_dict={"month": "September"}, plot_func=make_sic_bias_maps, **map_kwargs)

3.2.2 Arctic maximum#
Looking at the maps below for March, we can see the ice in the central Arctic is generally unbiased, but there are some areas where there is strong underestimation - the Bering Sea and the Sea of Okhotsk - and there is strong overestimation in the Greenland Sea and the North Atlantic Ocean near the entrance to Hudson Bay. Overall, the total ice area is roughly the same in CMIP6 and the satellite observations.
Show code cell source
compare_sic_maps(region="Arctic", sel_dict={"month": "March"}, **map_kwargs)

Show code cell source
compare_sic_maps(region="Arctic", sel_dict={"month": "March"}, plot_func=make_sic_bias_maps, **map_kwargs)

3.2.3 Antarctic minimum#
There is consistently too little ice on the Atlantic side of Antarctica (in the Weddell, Bellingshausen and Amundsen Seas). The amount of sea ice in those areas is also decreasing with time, while the concentration from satellite is staying relatively constant. On the Pacific and Indian ocean sides there is more of a dipole pattern (too little at the coast and too much away from it), probably due to the effect of low resolution in the CMIP6 models.
Show code cell source
compare_sic_maps(region="Antarctic", sel_dict={"month": "March"}, **map_kwargs)

Show code cell source
compare_sic_maps(region="Antarctic", sel_dict={"month": "March"}, plot_func=make_sic_bias_maps, **map_kwargs)

3.2.4 Antarctic maximum#
For the month of September there is in general too little ice everywhere, although away from the coast at longitude about 140\(^\circ\)W in the Atlantic Ocean, and in the Indian Ocean the underestimation is most pronounced. The underestimation on those areas is also increasing with time, while the concentration from satellite is staying relatively constant in this month as well.
Show code cell source
compare_sic_maps(region="Antarctic", sel_dict={"month": "September"}, **map_kwargs)

Show code cell source
compare_sic_maps(region="Antarctic", sel_dict={"month": "September"},
plot_func=make_sic_bias_maps, **map_kwargs)

3.2.5 Arctic December (relevant to the Arctic shipping route assessment)#
One of the other CMIP6 quality assessments looks at projections of accessibility of Arctic shipping routes, where sea ice in December is particularly interesting. Hence we also plot some maps for this month here. The CMIP6 ensemble mean is quite close to the satellite data in the Arctic Ocean, with the main feature of the bias maps being the underestimation in the Hudson Bay. Note that not all models that provide sea ice concentration in the historical experiment also provide it in the different projection experiments, so that adds some additional uncertainty to these conclusions.
Show code cell source
compare_sic_maps(region="Arctic", sel_dict={"month": "December"}, **map_kwargs)

Show code cell source
compare_sic_maps(region="Arctic", sel_dict={"month": "December"},
plot_func=make_sic_bias_maps, **map_kwargs)

ℹ️ If you want to know more#
Key resources#
Introductory sea ice materials:
Introductory CMIP6 materials:
Code libraries used:
C3S EQC custom functions developed by B-Open
References#
Davy, R., and S. Outten (2020). The Arctic Surface Climate in CMIP6: Status and Developments since CMIP5. J. Climate, 33, 8047–8068, https://doi.org/10.1175/JCLI-D-19-0990.1.
Frankignoul, C., L. Raillard, B. Ferster, and Y. Kwon (2024). Arctic September sea ice concentration biases in CMIP6 models and their relationships with other model variables. J. Climate, https://doi.org/10.1175/JCLI-D-23-0452.1, in press.
Goessling, H. F., S. Tietsche, J. J. Day, E. Hawkins, and T. Jung (2016). Predictability of the Arctic sea ice edge, Geophys. Res. Lett., 43, 1642–1650, https://doi.org/10.1002/2015GL067232.
Henke, M., F. Cassalho, T. Miesse, C. M. Ferreira, J. Zhang and T. M. Ravens (2023). Assessment of Arctic sea ice and surface climate conditions in nine CMIP6 climate models. Arctic, Antarctic, and Alpine Research, 55(1). https://doi.org/10.1080/15230430.2023.2271592
Nie, Y., X. Lin, Q. Yang, J. Liu, D. Chen and P. Uotila (2023). Differences between the CMIP5 and CMIP6 Antarctic sea ice concentration budgets. Geophysical Research Letters, 50, e2023GL105265. https://doi.org/10.1029/2023GL105265
Li, S., Y. Zhang, C. Chen, Y. Zhang, D. Xu, and S. Hu (2023). Assessment of Antarctic Sea Ice Cover in CMIP6 Prediction with Comparison to AMSR2 during 2015–2021. Remote Sensing 15(8): 2048. https://doi.org/10.3390/rs15082048.
Roach, L. A., J. Dörr, C. R. Holmes, F. Massonnet, E. W. Blockley, D. Notz, T. Rackow, M. N. Raphael, S. P. O’Farrell, D. A. Bailey, and C. M. Bitz (2020). Antarctic sea ice area in CMIP6. Geophysical Research Letters, 47, e2019GL086729. https://doi.org/10.1029/2019GL086729
Shu, Q., Q. Wang, Z. Song, F. Qiao, J. Zhao, M. Chu and X. Li (2020). Assessment of sea ice extent in CMIP6 with comparison to observations and CMIP5. Geophysical Research Letters, 47, e2020GL087965, https://agupubs.onlinelibrary.wiley.com/doi/10.1029/2020GL087965
SIMIP Community (2020). Arctic sea ice in CMIP6. Geophysical Research Letters, 47, e2019GL086749. https://doi.org/10.1029/2019GL086749
Watts, M., W. Maslowski, Y. J. Lee, J. C. Kinney, and R. Osinski (2021). A Spatial Evaluation of Arctic Sea Ice and Regional Limitations in CMIP6 Historical Simulations. J. Climate, 34, 6399–6420, https://doi.org/10.1175/JCLI-D-20-0491.1.