7.3. Consistency between the dataset underpinning the Copernicus Interactive Climate Atlas and its origins: Multiple origin datasets#
Please note that this repository is used for development and review, so quality assessments should be considered work in progress until they are merged into the main branch.
Production date: 2025-10-14.
Dataset version: 2.0.
Produced by: C3S2_521 contract.
🌍 Use case: Retrieving climate indicators from the Copernicus Interactive Climate Atlas#
❓ Quality assessment question#
Are the climate indicators in the dataset underpinning the Copernicus Interactive Climate Atlas consistent with their origin datasets?
Can the dataset underpinning the Copernicus Interactive Climate Atlas be reproduced from its origin datasets?
The Copernicus Interactive Climate Atlas, or Atlas for short, is a C3S web application providing an easy-to-access tool for exploring climate projections, reanalyses, and observational data [Guti24]. Version 2.0 of the application allows the user to interact with 12 datasets:
Type | Dataset |
---|---|
Climate Projection | CMIP6 |
Climate Projection | CMIP5 |
Climate Projection | CORDEX-CORE |
Climate Projection | CORDEX-EUR-11 |
Reanalysis | ERA5 |
Reanalysis | ERA5-Land |
Reanalysis | ORAS5 |
Reanalysis | CERRA |
Observations | E-OBS |
Observations | BERKEARTH |
Observations | CPC |
Observations | SST-CCI |
These datasets are provided through an intermediary dataset, the Gridded dataset underpinning the Copernicus Interactive Climate Atlas or Atlas dataset for short [AtlasData]. Compared to their origins, the versions of the climate datasets within the Atlas dataset have been processed following the workflow in Figure 7.3.1.

Fig. 7.3.1 Schematic representation of the workflow for the production of the Atlas dataset from its origin datasets, from the User-tools for the C3S Atlas.#
Because a wide range of users interact with climate data through the Atlas application, it is crucial that the underpinning dataset represent its origins correctly. In other words, the Atlas dataset must be consistent with and reproducible from its origins. Here, we assess this consistency and reproducibility by comparing climate indicators retrieved from the Atlas dataset with their equivalents calculated from the origin datasets, mirroring the workflow from Figure 7.3.1. While a full analysis and reproduction of every record within the Atlas dataset is outside the scope of this quality assessment (and would require high-performance computing infrastructure), a case study with a narrower scope probes these quality attributes of the dataset and can be a jumping-off point for further analysis by the reader.
This notebook is part of a series:
Notebook | Contents |
---|---|
 | Comparison between Atlas dataset and one origin dataset (CMIP6) for one indicator ( |
 | Comparison between Atlas dataset and one origin dataset (CMIP6) for multiple indicators. |
Consistency between the dataset underpinning the Copernicus Interactive Climate Atlas and its origins: Multiple origin datasets | Comparison between Atlas dataset and multiple origin datasets for one indicator. |
📢 Quality assessment statement#
These are the key outcomes of this assessment:
Values of climate indicators (here 3 monthly indicators) provided by the Gridded dataset underpinning the Copernicus Interactive Climate Atlas are highly consistent with values calculated from its origin datasets (here 10 different datasets). The general distribution of indicator values is the same across time and space. However, small differences exist due to the difference in grid, as well as large differences in coverage in specific cases (e.g. sea surface temperature in polar regions). Users of the Atlas dataset – and thus users of the Atlas application – should be aware that values retrieved from Atlas may differ from a manual analysis of the origin dataset.
The Atlas dataset is highly reproducible from its origins. The indicator values provided by the Atlas dataset are identical or very close to those resulting from a manual reprocessing of its origin datasets. Out of the indicator-origin pairs tested here, only SST-CCI sea surface temperature consistently shows differences, albeit minor ones, between Atlas and the origin. Furthermore, there are some differences in coverage (e.g. E-OBS). These differences should not affect most users of the Atlas dataset or application in most use cases. For further analysis, it is recommended to manually process the origin dataset.
📋 Methodology#
This quality assessment tests the consistency between climate indicators retrieved from the Gridded dataset underpinning the Copernicus Interactive Climate Atlas [AtlasData] and their equivalents calculated from the origin datasets, as well as the reproducibility of said dataset.
This notebook probes the consistency between the Atlas dataset and multiple origin datasets at the same time. Due to differences in scope (e.g. atmosphere / land / sea), not every indicator is available in every origin dataset or its Atlas derivative. Furthermore, some origin datasets are historical while others are future projections. For this reason, we will examine the following indicators in the following origin datasets:
Monthly count of days with maximum near-surface (2-metre) air temperature above 35 °C (tx35)
Type | Dataset |
---|---|
Climate Projection | CMIP6 |
Climate Projection | CMIP5 |
Climate Projection | CORDEX-EUR-11 |
Reanalysis | ERA5 |
Reanalysis | ERA5-Land |
Observations | E-OBS |
Observations | BERKEARTH |
Note that CORDEX-CORE has been left out of this assessment because its mosaicking workflow is out of scope. CERRA has been left out because the C3S User-tools package is currently not fully compatible with this dataset.
Monthly mean temperature of sea water near the surface (sst)
Type | Dataset |
---|---|
Reanalysis | ORAS5 |
Observations | SST-CCI |
Monthly count of days with daily accumulated precipitation of liquid water equivalent from all phases above 1 mm (r01)
Type | Dataset |
---|---|
Observations | CPC |
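As an illustration of how the two count-based indicators are defined, the following minimal sketch counts threshold exceedances per month for a single grid cell. It is not the code used in the Atlas workflow: the function `monthly_count_above`, the sample values, and the strict inequality (taken from the word "above" in the definitions) are all assumptions made for this example.

```python
from collections import Counter
from datetime import date, timedelta


def monthly_count_above(days, values, threshold):
    """Count, per (year, month), the days on which the value exceeds the threshold."""
    counts = Counter()
    for day, value in zip(days, values):
        # Adding 0 still registers the month, so months without exceedances report 0.
        counts[(day.year, day.month)] += int(value > threshold)
    return dict(counts)


# Hypothetical daily series for one grid cell, 29 June to 3 July 2020.
days = [date(2020, 6, 29) + timedelta(days=i) for i in range(5)]
tasmax_degc = [34.2, 35.6, 36.1, 33.9, 35.0]  # daily maximum 2-metre temperature (°C)
precip_mm = [0.0, 1.0, 4.2, 0.3, 12.5]  # daily accumulated precipitation (mm)

tx35 = monthly_count_above(days, tasmax_degc, threshold=35.0)
r01 = monthly_count_above(days, precip_mm, threshold=1.0)

print(tx35)  # {(2020, 6): 1, (2020, 7): 1}
print(r01)   # {(2020, 6): 0, (2020, 7): 2}
```

Note that values exactly equal to the threshold (35.0 °C on 3 July, 1.0 mm on 30 June) are not counted under the strict inequality assumed here; a different convention would change the result.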
The analysis and results are organised in the following steps, which are detailed in the sections below:
1. Code setup
Install User-tools for the C3S Atlas.
Import all required libraries.
Define helper functions.
2. Calculate and retrieve indicators
Download data from the origin datasets.
Homogenise data.
Calculate indicators.
Interpolate to a common and regular grid.
Download corresponding data from the Atlas dataset.
3. Results
Consistency: Compare the Atlas and reproduced datasets on native grids.
Reproducibility: Compare the Atlas and reproduced datasets on the Atlas grid.
📈 Analysis and results#
1. Code setup#
Note
This notebook uses earthkit for downloading (earthkit-data) and visualising (earthkit-plots) data. Because earthkit is in active development, some functionality may change after this notebook is published. If any part of the code stops functioning, please raise an issue on our GitHub repository so it can be fixed.
Install the User-tools for the C3S Atlas#
This notebook uses the User-tools for the C3S Atlas, which can be installed from GitHub using pip.
For convenience, the following cell can do this from within the notebook.
Further details and alternative options for installing this library are available in its documentation.
Import required libraries#
In this section, we import all the relevant packages needed for running the notebook.
Define indicators#
Helper functions#
General#
Downloading data#
Data (pre-)processing#
Statistics#
Visualisation#
2. Calculate and retrieve indicators#
In the previous two notebooks in this assessment, the origin data were downloaded, pre-processed, used to calculate the relevant indicator(s), and interpolated, after which the Atlas dataset was downloaded. This notebook follows the same structure but for each origin in turn, both for clarity and to limit memory use when loading multiple datasets at the same time. The individual steps are described in less detail here, because this information is available in the previous notebooks.
If you are only interested in specific origin datasets, or want to limit your bandwidth or memory usage, you can choose to only run specific subsections below.
This notebook uses earthkit-data to download files from the CDS. If you intend to run this notebook multiple times, it is highly recommended that you enable caching to prevent having to download the same files multiple times.
Note
This notebook uses xESMF for regridding data. xESMF is most easily installed using mamba/conda as explained in its documentation. Users who cannot or do not wish to use mamba/conda can manually compile and install ESMF on their machines. In future, this notebook will use earthkit-regrid instead, once it reaches suitable maturity.
Note that the Atlas workflow calculates indicators first, then regrids. For operations that involve averaging, like smoothing and regridding, the order of operations can affect the result, especially in areas with steep gradients [Bur20]. Examples of such areas for a temperature index are coastlines and mountain ranges. In the case of Atlas, this order of operations was a conscious choice to preserve the “raw” signals, e.g. preventing extreme temperatures from being smoothed out. However, it can affect the indicator values and therefore must be considered when using the Atlas application or dataset.
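The effect of the order of operations can be illustrated with a toy example. The sketch below is not the Atlas regridding: it assumes two hypothetical fine-grid cells that are averaged into one coarse cell, as a simple stand-in for any regridding method that averages neighbouring values. All names and numbers are invented for illustration.

```python
THRESHOLD = 35.0  # °C

# Hypothetical daily maximum temperatures (°C) in two neighbouring fine-grid cells.
coastal = [30.0, 33.0, 36.0, 37.0, 34.0]
inland = [36.0, 38.0, 33.0, 39.0, 31.0]


def tx35(daily_tmax):
    """Number of days with maximum temperature above the threshold."""
    return sum(1 for t in daily_tmax if t > THRESHOLD)


def regrid_mean(*cells):
    """Average the cells day by day (toy stand-in for regridding)."""
    return [sum(values) / len(values) for values in zip(*cells)]


# Order used by the Atlas workflow: calculate the indicator, then regrid.
indicator_then_regrid = (tx35(coastal) + tx35(inland)) / 2

# Reverse order: regrid the temperatures, then calculate the indicator.
regrid_then_indicator = tx35(regrid_mean(coastal, inland))

print(indicator_then_regrid)  # 2.5
print(regrid_then_indicator)  # 2
```

In this example, averaging the temperatures first smooths out some of the hot days in each cell, so the two orders give different counts (2.5 versus 2 days).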
General setup#
Throughout this section, we combine the downloaded datasets into dictionaries for easy access. They cannot be combined into a single xarray object because of differing grids. Each dataset is added in its own subsection, meaning any datasets not downloaded will automatically be skipped in the analysis.
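The dictionary pattern described above can be sketched as follows. This is a simplified illustration and not the notebook's own code: plain lists stand in for the xarray objects, and the names `atlas_data`, `reproduced_data`, and `register` are hypothetical.

```python
# Hypothetical containers; in the notebook, each value would be an xarray object.
atlas_data = {}
reproduced_data = {}


def register(name, atlas, reproduced):
    """Store the Atlas and reproduced versions of one origin dataset under its name."""
    atlas_data[name] = atlas
    reproduced_data[name] = reproduced


# Each origin dataset is registered in its own subsection.
register("CMIP6", atlas=[0, 2, 5], reproduced=[0, 2, 5])
register("E-OBS", atlas=[1, 0, 3], reproduced=[1, 0, 4])
# ERA5 was not downloaded in this example, so it is never registered.

# The analysis loops over whatever is present; missing datasets are skipped implicitly.
compared = []
for name in atlas_data:
    n_different = sum(a != r for a, r in zip(atlas_data[name], reproduced_data[name]))
    compared.append((name, n_different))

print(compared)  # [('CMIP6', 0), ('E-OBS', 1)]
```

Keying both dictionaries by dataset name keeps each Atlas and reproduced pair together without requiring the datasets to share a grid.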
CMIP6#
CMIP5#
CORDEX-EUR-11#
ERA5#
Note that ERA5 data for a year are >30 GB in size. These data may take up to several hours to download and require sufficient storage to download and cache.
ERA5-Land#
Note that ERA5-Land data for a year are >200 GB in size. These data may take up to several hours to download and require sufficient storage to download and cache.
E-OBS#
Note that E-OBS data for the full period are >30 GB in size. These data may take up to several hours to download and require sufficient storage to download and cache.
BERKEARTH#
ORAS5#
SST-CCI#
Note that SST-CCI data for a year are >50 GB in size. These data may take up to several hours to download and require sufficient storage to download and cache.
CPC#
Cleanup#
Lastly, we manually clear out some memory-intensive objects that are no longer necessary.
We also re-chunk the datasets to be more computationally efficient:
3. Results#
This section contains the comparison between the indicator values retrieved from the Atlas dataset vs those reproduced from the origin datasets.
The datasets are first compared on their native grids. This means a point-by-point comparison is not possible (because the points are not equivalent), but the distributions can be compared geospatially and overall. This comparison probes the consistency quality attribute: Are the climate indicators in the dataset underpinning the Copernicus Interactive Climate Atlas consistent with their origin datasets?
The second comparison uses the regridded version of the indicators derived from the origin datasets. This makes a point-by-point comparison possible. This second comparison probes how well the dataset underpinning the Copernicus Interactive Climate Atlas can be reproduced from its origin datasets, based on the workflow (Figure 7.3.1).
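For the first of these comparisons, one simple way to compare distributions from grids of different size is to normalise histograms over shared bins. The sketch below assumes this approach purely for illustration; the function name, bin edges, and sample values are hypothetical and do not come from the notebook.

```python
def normalised_histogram(values, bin_edges):
    """Fraction of values in each bin [edge_i, edge_i+1); the last bin includes its upper edge."""
    counts = [0] * (len(bin_edges) - 1)
    for value in values:
        for i in range(len(counts)):
            is_last = i == len(counts) - 1
            in_bin = bin_edges[i] <= value < bin_edges[i + 1]
            if in_bin or (is_last and value == bin_edges[-1]):
                counts[i] += 1
                break
    total = sum(counts)
    return [count / total for count in counts] if total else counts


# Hypothetical tx35 values (days per month) on two grids with different numbers of cells.
origin_native = [0, 0, 1, 2, 4, 7, 12, 30]  # finer grid: 8 cells
atlas_native = [0, 1, 5, 20]  # coarser grid: 4 cells
bin_edges = [0, 1, 5, 10, 31]

hist_origin = normalised_histogram(origin_native, bin_edges)
hist_atlas = normalised_histogram(atlas_native, bin_edges)

# Largest difference in bin fraction between the two distributions.
max_gap = max(abs(o - a) for o, a in zip(hist_origin, hist_atlas))

print(hist_origin)  # [0.25, 0.375, 0.125, 0.25]
print(hist_atlas)   # [0.25, 0.25, 0.25, 0.25]
print(max_gap)      # 0.125
```

Normalising by the number of values removes the effect of differing grid sizes, so only the shape of the distribution is compared.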
Consistency: Comparison on native grids#
For the geospatial comparison, we display the values of the indicators for one month, across one region and globally. In the first example, we display the results across Europe in June, which should provide significant spatial variation.
This region can easily be modified in the following code cell using the domains provided by earthkit-plots.
Some examples are provided in the cell (commented out using #).
As in the previous notebooks, it is clear from the geospatial comparisons (Figures 7.3.2–7.3.5) that the Atlas dataset closely resembles a manual reproduction of its origin datasets. The general distribution of indicator values is the same for all comparisons. Clear differences only show up in the comparisons of E-OBS tx35 (Figure 7.3.3) and SST-CCI sst (Figure 7.3.4). For E-OBS, the difference is primarily one of coverage, which may be explained by a filter being applied or a difference in the version of E-OBS used. For SST-CCI, small quantitative differences are apparent, particularly in the central and eastern Mediterranean Sea. Their cause is less clear. Lastly, the CMIP5 comparison (Figure 7.3.2) shows clear differences here due to the fact that the Atlas version of CMIP5 is regridded to a coarser resolution to ensure consistency between the different model members of the ensemble.
This pattern is also visible in the overall distributions (Figures 7.3.6–7.3.9), which are again very similar for almost all comparisons. A major difference appears in the SST-CCI sst comparison, where the manual reproduction contains considerably fewer values below 0. A geospatial comparison on a global grid (not shown here) reveals that this difference is explained by masking of values in the Arctic and Antarctic. This is likely caused by differences in the exact workflow used and hints at an explanation for the small quantitative differences seen in the European geospatial comparison.
The overall conclusion from this comparison is the same as in the previous notebooks. The Atlas dataset and its origins are highly consistent, but small differences exist due to the difference in grid and potential differences in the origin dataset version used and workflow. Large differences were observed only in the availability or masking of data in specific areas, such as on the edges of the E-OBS dataset and in the polar regions for SST-CCI. Users of the Atlas dataset – and thus users of the Atlas application – should be aware that the indicator values retrieved for a specific location may differ slightly from a manual analysis of the origin dataset.

Fig. 7.3.2 Comparison between Atlas dataset and reproduction for projected tx35 in one month, across Europe, on the native grid of each dataset.#

Fig. 7.3.3 Comparison between Atlas dataset and reproduction for historical tx35 in one month, across Europe, on the native grid of each dataset.#

Fig. 7.3.4 Comparison between Atlas dataset and reproduction for historical sst in one month, across Europe, on the native grid of each dataset.#

Fig. 7.3.5 Comparison between Atlas dataset and reproduction for historical r01 in one month, across Europe, on the native grid of each dataset.#

Fig. 7.3.6 Comparison between overall distributions of projected tx35 values in the Atlas dataset and its reproduction, across all spatial and temporal dimensions, on the native grid of each dataset.#

Fig. 7.3.7 Comparison between overall distributions of historical tx35 values in the Atlas dataset and its reproduction, across all spatial and temporal dimensions, on the native grid of each dataset.#

Fig. 7.3.8 Comparison between overall distributions of historical sst values in the Atlas dataset and its reproduction, across all spatial and temporal dimensions, on the native grid of each dataset.#

Fig. 7.3.9 Comparison between overall distributions of historical r01 values in the Atlas dataset and its reproduction, across all spatial and temporal dimensions, on the native grid of each dataset.#
Reproducibility: Comparison on Atlas grid#
After regridding/interpolating, the indicator values reproduced from the origin dataset can be compared point-by-point to the values retrieved from the Atlas dataset. We first examine some metrics that describe the difference Δ between corresponding pixels:
Dataset | Mean Δ | Median Δ | Median \|Δ\| | % where \|Δ\| ≥ ε | Pearson r |
---|---|---|---|---|---|
CMIP6 | -0.00095 | 0.00000 | 0.00000 | 0.09388 | 0.99998 |
CMIP5 | 0.00000 | 0.00000 | 0.00000 | 10.20628 | 1.00000 |
CORDEX-EUR-11 | -0.00065 | 0.00000 | 0.00000 | 4.20466 | 1.00000 |
Dataset | Mean Δ | Median Δ | Median \|Δ\| | % where \|Δ\| ≥ ε | Pearson r |
---|---|---|---|---|---|
ERA5 | 0.00003 | 0.00000 | 0.00000 | 0.00311 | 1.00000 |
ERA5-Land | -0.00002 | 0.00000 | 0.00000 | 0.00597 | 1.00000 |
E-OBS | -0.11389 | 0.00000 | 0.00000 | 3.63519 | 0.91907 |
BERKEARTH | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 1.00000 |
Dataset | Mean Δ | Median Δ | Median \|Δ\| | % where \|Δ\| ≥ ε | Pearson r |
---|---|---|---|---|---|
ORAS5 | 0.00000 | 0.00000 | 0.00024 | 65.47403 | 1.00000 |
SST-CCI | -0.04128 | -0.03296 | 0.13577 | 53.97652 | 0.99932 |
Dataset | Mean Δ | Median Δ | Median \|Δ\| | % where \|Δ\| ≥ ε | Pearson r |
---|---|---|---|---|---|
CPC | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 1.00000 |
As in the previous notebooks, it is clear that the Atlas dataset and its manual reproduction are very similar. The median difference and median absolute difference are close to 0 in all comparisons. In the tx35 and r01 comparisons, the vast majority of pixels show a near-zero difference, defined here as |Δ| < ε with ε = \(10^{-5}\) to allow for floating-point errors; CMIP5 has the largest share of pixels with |Δ| ≥ ε, at about 10%. The sst comparisons are the exception, with 65% (ORAS5) and 54% (SST-CCI) of pixels showing |Δ| ≥ ε. For ORAS5 these differences are tiny (median |Δ| = 0.00024), so SST-CCI shows the most significant differences, as in the previous section.
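The metrics in the tables above can be computed along the following lines. This is a minimal sketch on plain Python lists rather than the notebook's own helper functions; the function name, the sample values, and the sign convention Δ = Atlas − reproduction are assumptions made for illustration.

```python
import math
from statistics import mean, median

EPSILON = 1e-5  # tolerance below which a difference is treated as zero


def difference_metrics(atlas, reproduced, epsilon=EPSILON):
    """Summarise per-pixel differences between two equally sized sequences.

    Assumes neither sequence is constant; otherwise Pearson r is undefined.
    """
    delta = [a - r for a, r in zip(atlas, reproduced)]  # assumed sign convention
    abs_delta = [abs(d) for d in delta]

    # Pearson correlation coefficient, written out explicitly.
    mean_a, mean_r = mean(atlas), mean(reproduced)
    covariance = sum((a - mean_a) * (r - mean_r) for a, r in zip(atlas, reproduced))
    spread_a = math.sqrt(sum((a - mean_a) ** 2 for a in atlas))
    spread_r = math.sqrt(sum((r - mean_r) ** 2 for r in reproduced))

    return {
        "mean": mean(delta),
        "median": median(delta),
        "median_abs": median(abs_delta),
        "percent_nonzero": 100 * sum(d >= epsilon for d in abs_delta) / len(delta),
        "pearson_r": covariance / (spread_a * spread_r),
    }


# Hypothetical indicator values for four pixels on the Atlas grid.
atlas = [0.0, 2.0, 5.0, 9.0]
reproduced = [0.0, 2.0, 5.0, 10.0]

metrics = difference_metrics(atlas, reproduced)
for name, value in metrics.items():
    print(f"{name}: {value:.5f}")
```

In this example a single deviating pixel shifts the mean difference to -0.25 while both medians stay at 0, mirroring the pattern seen for E-OBS, where a small share of pixels with large differences affects the mean but not the median.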
These observations are confirmed by the overall distributions (Figures 7.3.10–7.3.13) and the geospatial distributions (Figures 7.3.14–7.3.17). Notably, while the difference between tx35 in the Atlas dataset and E-OBS is typically small, with a median of 0 and with ≤4% of pixels showing non-zero differences, it has a long tail of large differences (Figure 7.3.11). These pixels are concentrated at the periphery of the domain (Figure 7.3.15). This may be the result of a difference in the underlying dataset version or the regridding process.
We can extend the conclusion from the previous notebooks, namely that the Atlas dataset can be considered practically reproducible. Some indicators in the Atlas dataset are completely identical to their origin datasets, e.g. r01 in the CPC comparison; others show non-zero differences, e.g. some of the pixels in the CMIP6 and E-OBS tx35 comparisons. The causes of these differences are unclear, but they are generally small and rare enough to be negligible for most users of the Atlas dataset, especially since they will typically use it through the application. For further analysis, it is generally best to manually process the origin dataset, if only to reduce the number of steps that may affect the result.

Fig. 7.3.10 Comparison between overall distributions of projected tx35 values in the Atlas dataset and its reproduction on the Atlas grid, across all spatial and temporal dimensions, including the per-pixel difference.#

Fig. 7.3.11 Comparison between overall distributions of historical tx35 values in the Atlas dataset and its reproduction on the Atlas grid, across all spatial and temporal dimensions, including the per-pixel difference.#

Fig. 7.3.12 Comparison between overall distributions of historical sst values in the Atlas dataset and its reproduction on the Atlas grid, across all spatial and temporal dimensions, including the per-pixel difference.#

Fig. 7.3.13 Comparison between overall distributions of historical r01 values in the Atlas dataset and its reproduction on the Atlas grid, across all spatial and temporal dimensions, including the per-pixel difference.#

Fig. 7.3.14 Comparison between Atlas dataset and reproduction for projected tx35 in one month, across Europe, on the Atlas dataset grid, including the per-pixel difference.#

Fig. 7.3.15 Comparison between Atlas dataset and reproduction for historical tx35 in one month, across Europe, on the Atlas dataset grid, including the per-pixel difference.#

Fig. 7.3.16 Comparison between Atlas dataset and reproduction for historical sst in one month, across Europe, on the Atlas dataset grid, including the per-pixel difference.#

Fig. 7.3.17 Comparison between Atlas dataset and reproduction for historical r01 in one month, across Europe, on the Atlas dataset grid, including the per-pixel difference.#
ℹ️ If you want to know more#
Key resources#
The CDS catalogue entries for the data used were:
Gridded dataset underpinning the Copernicus Interactive Climate Atlas: multi-origin-c3s-atlas
CMIP6 climate projections: projections-cmip6
CMIP5 daily data on single levels: projections-cmip5-daily-single-levels
CORDEX regional climate model data on single levels: projections-cordex-domains-single-levels
ERA5 hourly data on single levels from 1940 to present: reanalysis-era5-single-levels
ERA5-Land hourly data from 1950 to present: reanalysis-era5-land
E-OBS daily gridded meteorological data for Europe from 1950 to present derived from in-situ observations: insitu-gridded-observations-europe
Temperature and precipitation gridded data for global and regional domains derived from in-situ and satellite observations (BERKEARTH, CPC): insitu-gridded-observations-global-and-regional
ORAS5 global ocean reanalysis monthly data from 1958 to present: reanalysis-oras5
Sea surface temperature daily gridded data from 1981 to 2016 derived from a multi-product satellite-based ensemble (SST-CCI): satellite-sea-surface-temperature-ensemble-product
Code libraries used:
xclim climate indicator tools
More about the Copernicus Interactive Climate Atlas and its IPCC predecessor:
References#
[Guti24] J. M. Gutiérrez et al., ‘The Copernicus Interactive Climate Atlas: a tool to explore regional climate change’, ECMWF Newsletter, vol. 181, pp. 38–45, Oct. 2024, doi: 10.21957/ah52ufc369.
[AtlasData] Copernicus Climate Change Service, ‘Gridded dataset underpinning the Copernicus Interactive Climate Atlas’. Copernicus Climate Change Service (C3S) Climate Data Store (CDS), Jun. 17, 2024. doi: 10.24381/cds.h35hb680.
[CMIP6data] Copernicus Climate Change Service, ‘CMIP6 climate projections’. Copernicus Climate Change Service (C3S) Climate Data Store (CDS), Mar. 23, 2021. doi: 10.24381/cds.c866074c.
[Bur20] O. Burggraaff, ‘Biases from incorrect reflectance convolution’, Optics Express, vol. 28, no. 9, pp. 13801–13816, Apr. 2020, doi: 10.1364/OE.391470.