{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "![logo](./img/LogoLine_horizon_C3S.png)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Plot an Ensemble of CMIP6 Climate Projections" ] }, { "cell_type": "markdown", "metadata": { "tags": [] }, "source": [ "### About" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This notebook provides a practical introduction on how to access and process [CMIP6 global climate projections](https://cds.climate.copernicus.eu/cdsapp#!/dataset/projections-cmip6?tab=overview) data available in the Climate Data Store (CDS) of the Copernicus Climate Change Service (C3S). The workflow shows how to compute and visualize the output of an ensemble of models for the annual global average temperature between 1850 to 2100. You will use the `historical` experiment for the temporal period 1850 to 2014 and the three scenarios `SSP1-2.6`, `SSP2-4.5` and `SSP5-8.5` for the period from 2015 to 2100.\n", "\n", "For the sake of simplicity, and to facilitate data download, the tutorial will make use of some of the coarser resolution models that have a smaller data size. It is nevertheless only a choice for this exercise and not a recommendation (since ideally all models, including those with highest resolution, should be used). Many more models are available on the CDS, and when calculating an ensemble of models, it is best practice to use as many as possible for a more reliable output. See [here](https://confluence.ecmwf.int/display/CKB/CMIP6%3A+Global+climate+projections#CMIP6:Globalclimateprojections-Models,gridsandpressurelevels) a full list of models included in the CDS-CMIP6 dataset.\n", "\n", "Learn [here](https://confluence.ecmwf.int/display/CKB/CMIP6%3A+Global+climate+projections#CMIP6:Globalclimateprojections) more about CMIP6 global climate projections and the CMIP6 experiments in the CDS." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "The notebook has the following outline:\n", "\n", "1. Request data from the CDS programmatically with the CDS API\n", "2. 
Unzip the downloaded data files\n", "3. Load and prepare CMIP6 data for one model and one experiment\n", "4. Load and prepare CMIP6 data for all models and experiments\n", "5. Visualize CMIP6 annual global average temperature from 1850 to 2100" ] }, { "cell_type": "markdown", "metadata": { "tags": [] }, "source": [ "### Data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This notebook introduces you to [CMIP6 Global climate projections](https://cds.climate.copernicus.eu/cdsapp#!/dataset/projections-cmip6?tab=overview). The datasets used in the notebook have the following specifications:\n", "\n", "> **Data**: CMIP6 global climate projections of near-surface air temperature
\n", "> **Experiments**: Historical, SSP1-2.6, SSP2-4.5, SSP5-8.5
\n", "> **Models**: 7 models from Germany, France, UK, Japan and Russia
\n", "> **Temporal range**: Historical: 1850 - 2014. Scenarios: 2015 - 2100
\n", "> **Spatial coverage**: Global
\n", "> **Format**: NetCDF, compressed into zip files" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Install CDS API via pip" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!pip install cdsapi" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Load libraries" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "# General libs for file paths, data extraction, etc\n", "from glob import glob\n", "from pathlib import Path\n", "from os.path import basename\n", "import zipfile # To extract zipfiles\n", "import urllib3 \n", "urllib3.disable_warnings() # Disable warnings for data download via API\n", "\n", "# CDS API\n", "import cdsapi\n", "\n", "# Libraries for working with multi-dimensional arrays\n", "import numpy as np\n", "import xarray as xr\n", "import pandas as pd\n", "\n", "# Libraries for plotting and visualising data\n", "import matplotlib.path as mpath\n", "import matplotlib.pyplot as plt\n", "import cartopy.crs as ccrs\n", "from cartopy.mpl.gridliner import LONGITUDE_FORMATTER, LATITUDE_FORMATTER\n", "import cartopy.feature as cfeature" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Request data from the CDS programmatically with the CDS API" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We will request data from the Climate Data Store (CDS) programmatically with the help of the CDS API. Let us make use of the option to manually set the CDS API credentials. First, you have to define two variables: `URL` and `KEY` which build together your CDS API key. The string of characters that make up your KEY include your personal User ID and CDS API key. To obtain these, first register or login to the CDS (http://cds.climate.copernicus.eu), then visit https://cds.climate.copernicus.eu/api-how-to and copy the string of characters listed after \"key:\". Replace the `#########` below with this string." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "URL = 'https://cds.climate.copernicus.eu/api/v2'\n", "KEY = '##################################'" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here we specify a data directory in which we will download our data and all output files that we will generate:" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "DATADIR = './'" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The next step is then to request the data with the help of the CDS API. Below, we loop through multiple data requests. These include data for different models and scenarios. It is not possible to specify multiple models in one data request as their spatial resolution varies.\n", "\n", "We will download monthly aggregated data. 
These are disseminated as NetCDF files within a zip archive.\n", "\n", "In order to loop through the various experiments and models in our data requests, we will specify them as Python lists here:" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [], "source": [ "experiments = ['historical', 'ssp126', 'ssp245', 'ssp585']" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [], "source": [ "models = ['hadgem3_gc31_ll', 'inm_cm5_0', 'inm_cm4_8', 'ipsl_cm6a_lr', \n", "          'miroc_es2l', 'mpi_esm1_2_lr', 'ukesm1_0_ll']" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "> **Note:** These are a selection of the lightest models (in terms of data volume), chosen to facilitate download for this exercise. There are many [more models available on the CDS](https://cds.climate.copernicus.eu/cdsapp#!/dataset/projections-cmip6?tab=overview)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we can download the data for each model and experiment sequentially. We will do this separately for the historical experiments and for the various future scenarios, given that they refer to two different time periods.\n", "\n", "Before you run the cells below, you need to have accepted the terms and conditions on the use of the data in the CDS. You can view and accept these conditions by logging into the [CDS](http://cds.climate.copernicus.eu), searching for the dataset, then scrolling to the end of the `Download data` section." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "> **Note:** For more information about data access through the Climate Data Store, please see the dedicated tutorial [here](./climate_data_store_intro.ipynb)." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# DOWNLOAD DATA FOR HISTORICAL PERIOD\n", "\n", "c = cdsapi.Client(url=URL, key=KEY)\n", "\n", "for j in models:\n", "    c.retrieve(\n", "        'projections-cmip6',\n", "        {\n", "            'format': 'zip',\n", "            'temporal_resolution': 'monthly',\n", "            'experiment': 'historical',\n", "            'level': 'single_levels',\n", "            'variable': 'near_surface_air_temperature',\n", "            'model': f'{j}',\n", "            'date': '1850-01-01/2014-12-31',\n", "        },\n", "        f'{DATADIR}cmip6_monthly_1850-2014_historical_{j}.zip')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# DOWNLOAD DATA FOR FUTURE SCENARIOS\n", "\n", "c = cdsapi.Client(url=URL, key=KEY)\n", "\n", "for i in experiments[1:]:\n", "    for j in models:\n", "        c.retrieve(\n", "            'projections-cmip6',\n", "            {\n", "                'format': 'zip',\n", "                'temporal_resolution': 'monthly',\n", "                'experiment': f'{i}',\n", "                'level': 'single_levels',\n", "                'variable': 'near_surface_air_temperature',\n", "                'model': f'{j}',\n", "                'date': '2015-01-01/2100-12-31',\n", "            },\n", "            f'{DATADIR}cmip6_monthly_2015-2100_{i}_{j}.zip')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Unzip the downloaded data files" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "From the CDS, CMIP6 data are available as `NetCDF` files compressed into `zip` archives. For this reason, before we can load any data, we have to extract the files. Having downloaded the four experiments `historical`, `SSP1-2.6`, `SSP2-4.5` and `SSP5-8.5` as separate zip files, we can use the functions from the `zipfile` Python package to extract their contents. For each zip file we first construct a `ZipFile()` object, then we apply the function `extractall()` to extract its content." 
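, "\n", "Before extracting, it can be useful to check what an archive contains. The following is a minimal, self-contained sketch (not part of the original workflow): it builds a small throwaway zip in a temporary directory, with the made-up file name `tas_demo.nc`, lists its members with `namelist()` and then extracts them with `extractall()`:

```python
import tempfile
import zipfile
from pathlib import Path

# Work in a temporary directory so that no real downloads are touched
tmp_dir = Path(tempfile.mkdtemp())
demo_zip = tmp_dir / 'demo.zip'

# Build a small throwaway archive; 'tas_demo.nc' is a made-up placeholder name
with zipfile.ZipFile(demo_zip, 'w') as zip_out:
    zip_out.writestr('tas_demo.nc', 'placeholder content')

# List the members first, then extract them next to the archive
with zipfile.ZipFile(demo_zip, 'r') as zip_ref:
    members = zip_ref.namelist()
    zip_ref.extractall(tmp_dir)

extracted = sorted(p.name for p in tmp_dir.glob('*.nc'))
print(members, extracted)
```

Applied to the downloaded archives, the same `namelist()` call shows which NetCDF files each zip will produce before anything is written to disk."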
] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [], "source": [ "cmip6_zip_paths = glob(f'{DATADIR}*.zip')\n", "for j in cmip6_zip_paths:\n", "    with zipfile.ZipFile(j, 'r') as zip_ref:\n", "        zip_ref.extractall(f'{DATADIR}')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Create a list of the extracted files\n", "\n", "To facilitate batch processing later in the tutorial, here we create a list of the extracted NetCDF files:" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [], "source": [ "# `basename` was imported directly from os.path; sort for a reproducible order\n", "cmip6_nc = list()\n", "cmip6_nc_rel = sorted(glob(f'{DATADIR}tas*.nc'))\n", "for i in cmip6_nc_rel:\n", "    cmip6_nc.append(basename(i))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We will briefly inspect this list by printing the first five elements, corresponding to the filenames of a sample of the extracted NetCDF files:" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "['tas_Amon_HadGEM3-GC31-LL_historical_r1i1p1f3_gn_18500216-20141216_v20190624.nc',\n", " 'tas_Amon_HadGEM3-GC31-LL_ssp126_r1i1p1f3_gn_201501-204912_v20200114.nc',\n", " 'tas_Amon_HadGEM3-GC31-LL_ssp126_r1i1p1f3_gn_205001-210012_v20200114.nc',\n", " 'tas_Amon_HadGEM3-GC31-LL_ssp245_r1i1p1f3_gn_201501-204912_v20190908.nc',\n", " 'tas_Amon_HadGEM3-GC31-LL_ssp245_r1i1p1f3_gn_205001-210012_v20190908.nc']" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "cmip6_nc[0:5]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Load and prepare CMIP6 data for one model and one experiment" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now that we have downloaded and extracted the data, we can prepare it in order to view a time series of the spread of annual global temperature for the model ensemble. These preparation steps include the following:\n", "\n", "1. 
**Spatial aggregation**: to have a single global temperature value for each model/experiment dataset, and for each time step\n", "2. **Temporal aggregation**: from monthly to yearly\n", "3. **Conversion of temperature units** from Kelvin to degrees Celsius\n", "4. **Addition of data dimensions** in preparation for the merging of datasets from different models and experiments\n", "\n", "In this section we apply these steps to a single dataset from one model and one experiment. In the next section we merge data from all models/experiments in preparation for the final processing and plotting of the temperature time series." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Load and inspect data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We begin by loading the first of the NetCDF files in our list. We will use the Python library [xarray](http://xarray.pydata.org/en/stable/) and its function `open_dataset` to read NetCDF files.\n", "\n", "The result is an `xarray.Dataset` object with four dimensions: `bnds`, `lat`, `lon`, `time`, of which `bnds` has no coordinate values attached (it only indexes the lower and upper bounds of each cell)." ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
<xarray.Dataset>\n",
       "Dimensions:    (time: 1979, bnds: 2, lat: 144, lon: 192)\n",
       "Coordinates:\n",
       "  * time       (time) object 1850-02-16 00:00:00 ... 2014-12-16 00:00:00\n",
       "  * lat        (lat) float64 -89.38 -88.12 -86.88 -85.62 ... 86.88 88.12 89.38\n",
       "  * lon        (lon) float64 0.9375 2.812 4.688 6.562 ... 355.3 357.2 359.1\n",
       "    height     float64 1.5\n",
       "Dimensions without coordinates: bnds\n",
       "Data variables:\n",
       "    time_bnds  (time, bnds) object 1850-02-01 00:00:00 ... 2015-01-01 00:00:00\n",
       "    lat_bnds   (time, lat, bnds) float64 ...\n",
       "    lon_bnds   (time, lon, bnds) float64 ...\n",
       "    tas        (time, lat, lon) float32 ...\n",
       "Attributes: (12/46)\n",
       "    Conventions:            CF-1.7 CMIP-6.2\n",
       "    activity_id:            CMIP\n",
       "    branch_method:          standard\n",
       "    branch_time_in_child:   0.0\n",
       "    branch_time_in_parent:  0.0\n",
       "    creation_date:          2019-06-19T11:21:17Z\n",
       "    ...                     ...\n",
       "    title:                  HadGEM3-GC31-LL output prepared for CMIP6\n",
       "    variable_id:            tas\n",
       "    variant_label:          r1i1p1f3\n",
       "    license:                CMIP6 model data produced by the Met Office Hadle...\n",
       "    cmor_version:           3.4.0\n",
       "    tracking_id:            hdl:21.14100/b6959414-d5ed-4cd9-a627-59238e52132d
" ], "text/plain": [ "\n", "Dimensions: (time: 1979, bnds: 2, lat: 144, lon: 192)\n", "Coordinates:\n", " * time (time) object 1850-02-16 00:00:00 ... 2014-12-16 00:00:00\n", " * lat (lat) float64 -89.38 -88.12 -86.88 -85.62 ... 86.88 88.12 89.38\n", " * lon (lon) float64 0.9375 2.812 4.688 6.562 ... 355.3 357.2 359.1\n", " height float64 ...\n", "Dimensions without coordinates: bnds\n", "Data variables:\n", " time_bnds (time, bnds) object ...\n", " lat_bnds (time, lat, bnds) float64 ...\n", " lon_bnds (time, lon, bnds) float64 ...\n", " tas (time, lat, lon) float32 ...\n", "Attributes: (12/46)\n", " Conventions: CF-1.7 CMIP-6.2\n", " activity_id: CMIP\n", " branch_method: standard\n", " branch_time_in_child: 0.0\n", " branch_time_in_parent: 0.0\n", " creation_date: 2019-06-19T11:21:17Z\n", " ... ...\n", " title: HadGEM3-GC31-LL output prepared for CMIP6\n", " variable_id: tas\n", " variant_label: r1i1p1f3\n", " license: CMIP6 model data produced by the Met Office Hadle...\n", " cmor_version: 3.4.0\n", " tracking_id: hdl:21.14100/b6959414-d5ed-4cd9-a627-59238e52132d" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ds = xr.open_dataset(f'{DATADIR}{cmip6_nc[0]}')\n", "ds" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "By examining the data above, we can see from the temporal range (1850 to 2014) that it is from the `historical` experiment.\n", "\n", "We see that the data dimensions have been given labelled coordinates of time, latitude and longitude. We can find more about the dataset from the `Attributes`, such information includes the model name, description of the variable (`long_name`), units, etc.\n", "\n", "Some of this information we will need later, this includes the experiment and model IDs. 
We will save these into variables:" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [], "source": [ "exp = ds.attrs['experiment_id']\n", "mod = ds.attrs['source_id']" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "An `xarray.Dataset()` may contain arrays of multiple variables. We only have one variable in the dataset, which is near-surface air temperature, `tas`. Below we create an `xarray.DataArray()` object, which takes only one variable, but gives us more flexibility in processing." ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [], "source": [ "da = ds['tas']" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Spatial aggregation" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The next step is to aggregate the temperature values spatially (i.e. average over the latitude and longitude dimensions) and compute the global monthly near-surface temperature.\n", "\n", "A very important consideration however is that the gridded data cells do not all correspond to the same areas. The size covered by each data point varies as a function of latitude. We need to take this into account when averaging. One way to do this is to use the cosine of the latitude as a proxy for the varying sizes. \n", "\n", "This can be implemented by first calculating weights as a function of the cosine of the latitude, then applying these weights to the data array with the xarray function `weighted()`:" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [], "source": [ "weights = np.cos(np.deg2rad(da.lat))\n", "weights.name = \"weights\"\n", "da_weighted = da.weighted(weights)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The next step is then to compute the mean across the latitude and longitude dimensions of the weighted data array with the function `mean()`. The result is a DataArray with one dimension (`time`)." 
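, "\n", "To see what the weighting does, the following numpy-only sketch (an illustration on a synthetic field, not part of the original workflow) compares a plain mean with a cosine-latitude weighted mean. The 10-degree grid and the names `field`, `plain_mean` and `weighted_mean` are assumptions made for this example:

```python
import numpy as np

# Synthetic regular grid: 18 latitude bands x 36 longitudes (cell centres)
lat = np.arange(-85.0, 90.0, 10.0)
lon = np.arange(5.0, 360.0, 10.0)

# Made-up temperature field (degC), warm at the equator and cold at the poles
field = 30.0 * np.cos(np.deg2rad(lat))[:, None] * np.ones((lat.size, lon.size))

# Cosine of latitude as a proxy for grid-cell area
weights = np.cos(np.deg2rad(lat))

# Plain mean treats every cell equally, so small polar cells are over-represented
plain_mean = field.mean()

# Weighted mean: average over longitude first, then weight each latitude band
weighted_mean = np.average(field.mean(axis=1), weights=weights)

print(round(plain_mean, 2), round(weighted_mean, 2))
```

Because the cold polar cells cover less area, the weighted mean of this synthetic field is higher than the plain mean. The xarray `weighted()` method applies the same idea along the `lat` dimension."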
] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [], "source": [ "da_agg = da_weighted.mean(['lat', 'lon'])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Temporal aggregation" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We now aggregate the monthly global near-surface air temperature values to annual global near-surface air temperature values. This operation can be done in two steps: first, all the values for one specific year have to be grouped with the function `groupby()` and second, we can create the average of each group with the function `mean()`.\n", "\n", "The result is a one-dimensional DataArray. Please note that this operation changes the name of the dimension from `time` to `year`." ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [], "source": [ "da_yr = da_agg.groupby('time.year').mean()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Conversion from Kelvin to Celsius" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The metadata of the original data (before it was stripped during the subsequent processing steps) tells us that the near-surface air temperature data values are in units of Kelvin. We will convert them to degrees Celsius by subtracting 273.15 from the data values. " ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [], "source": [ "da_yr = da_yr - 273.15" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Create additional data dimensions (to later combine data from multiple models & experiments)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Finally, we will create additional dimensions for the model and for the experiment. These we will label with the model and experiment name as taken from the metadata of the original data (see above). 
These will be useful when we repeat the processes above for all models and experiments, and combine them into one array." ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [], "source": [ "da_yr = da_yr.assign_coords(model=mod)\n", "da_yr = da_yr.expand_dims('model')\n", "da_yr = da_yr.assign_coords(experiment=exp)\n", "da_yr = da_yr.expand_dims('experiment')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Load and prepare CMIP6 data for all models and experiments" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To repeat the steps above for all models and all experiments, we will collect all of the commands we have used so far into a function, which we can then apply to a batch of files corresponding to the data from all models and experiments." ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [], "source": [ "# Function to aggregate in geographical lat lon dimensions\n", "\n", "def geog_agg(fn):\n", "    ds = xr.open_dataset(f'{DATADIR}{fn}')\n", "    exp = ds.attrs['experiment_id']\n", "    mod = ds.attrs['source_id']\n", "    da = ds['tas']\n", "    weights = np.cos(np.deg2rad(da.lat))\n", "    weights.name = \"weights\"\n", "    da_weighted = da.weighted(weights)\n", "    da_agg = da_weighted.mean(['lat', 'lon'])\n", "    da_yr = da_agg.groupby('time.year').mean()\n", "    da_yr = da_yr - 273.15\n", "    da_yr = da_yr.assign_coords(model=mod)\n", "    da_yr = da_yr.expand_dims('model')\n", "    da_yr = da_yr.assign_coords(experiment=exp)\n", "    da_yr = da_yr.expand_dims('experiment')\n", "    da_yr.to_netcdf(path=f'{DATADIR}cmip6_agg_{exp}_{mod}_{str(da_yr.year[0].values)}.nc')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we can apply this function to all the extracted NetCDF files. The `try` and `except` clauses ensure that all NetCDF files are attempted, even if some fail to be processed. One reason why some may fail is if the data are labelled differently, e.g. 
the model *MCM-UA-1-0* has coordinates labelled as \"*latitude*\" and \"*longitude*\". This differs from the suggested standard, and more commonly applied, labels of \"*lat*\" and \"*lon*\". Any files that fail will be reported in a print statement, and these can be processed separately. More details on the quality control of the CMIP6 datasets on the CDS are available [here](https://confluence.ecmwf.int/display/CKB/CMIP6%3A+Global+climate+projections#CMIP6:Globalclimateprojections-QualitycontroloftheCDS-CMIP6subset)." ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [], "source": [ "for i in cmip6_nc:\n", "    try:\n", "        geog_agg(i)\n", "    except Exception as err:\n", "        # Report the failing file and the reason, then carry on with the rest\n", "        print(f'{i} failed: {err}')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If nothing is printed, all files were processed successfully. \n", "\n", "We will now combine these processed files into one dataset for the final steps to create a visualisation of near-surface air temperature from the model ensemble." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Provided the files share consistent coordinates, the function `xarray.open_mfdataset` combines them into a single dataset along those coordinates." ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [], "source": [ "data_ds = xr.open_mfdataset(f'{DATADIR}cmip6_agg*.nc')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The dataset created by `xarray.open_mfdataset` is by default in the form of \"lazy Dask arrays\". \n", "\n", "Dask divides arrays into many small pieces, called chunks, each of which is presumed to be small enough to fit into memory. As opposed to eager evaluation, operations on Dask arrays are lazy, i.e. operations queue up a series of tasks mapped over blocks, and no computation is performed until you request values to be computed. For more details, see https://xarray.pydata.org/en/stable/user-guide/dask.html. 
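As a small illustration of lazy versus eager evaluation, here is a sketch on a synthetic array. It assumes Dask is installed, and the names `demo`, `lazy_mean` and `eager_mean` are made up for this example:

```python
import numpy as np
import xarray as xr

# Synthetic 3 x 4 array split into one Dask chunk per model (requires dask)
demo = xr.DataArray(np.arange(12.0).reshape(3, 4), dims=['model', 'year'])
demo = demo.chunk({'model': 1})

# This only builds a task graph; no values are computed yet
lazy_mean = demo.mean('model')
is_lazy = lazy_mean.chunks is not None

# load() triggers the computation and keeps the result in memory
eager_mean = lazy_mean.load()
print(is_lazy, eager_mean.values)
```

Until `load()` is called, `lazy_mean` holds only a description of the work to be done. Note that `load()` works in place, which is why the laziness check is made before calling it.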
\n", "\n", "To facilitate further processing we would need to convert these Dask arrays into in-memory \"eager\" arrays, which we can do by using the `load()` method: " ] }, { "cell_type": "code", "execution_count": 30, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
<xarray.Dataset>\n",
       "Dimensions:     (year: 251, model: 7, experiment: 4)\n",
       "Coordinates:\n",
       "  * year        (year) int64 1850 1851 1852 1853 1854 ... 2097 2098 2099 2100\n",
       "    height      (model) float64 1.5 2.0 2.0 2.0 2.0 2.0 1.5\n",
       "  * model       (model) object 'HadGEM3-GC31-LL' 'INM-CM4-8' ... 'UKESM1-0-LL'\n",
       "  * experiment  (experiment) object 'historical' 'ssp126' 'ssp245' 'ssp585'\n",
       "Data variables:\n",
       "    tas         (experiment, model, year) float64 13.75 13.62 ... 20.48 20.63
" ], "text/plain": [ "\n", "Dimensions: (year: 251, model: 7, experiment: 4)\n", "Coordinates:\n", " * year (year) int64 1850 1851 1852 1853 1854 ... 2097 2098 2099 2100\n", " height (model) float64 1.5 2.0 2.0 2.0 2.0 2.0 1.5\n", " * model (model) object 'HadGEM3-GC31-LL' 'INM-CM4-8' ... 'UKESM1-0-LL'\n", " * experiment (experiment) object 'historical' 'ssp126' 'ssp245' 'ssp585'\n", "Data variables:\n", " tas (experiment, model, year) float64 13.75 13.62 ... 20.48 20.63" ] }, "execution_count": 30, "metadata": {}, "output_type": "execute_result" } ], "source": [ "data_ds.load()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Finally, we create an Xarray DataArray object for the near-surface air temperature variable, 'tas':" ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [], "source": [ "data = data_ds['tas']" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Visualize the CMIP6 annual global average temperature between 1850 to 2100" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We will now create a plot of the model ensemble of near-surface air temperature for the historical and future periods, according to the three selected scenarios." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Calculate quantiles for model ensemble" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Rather than plotting the data from all models, we will instead view the range of values as given by quantiles across the models: the 0.1 quantile (10th percentile, near the lower limit), the 0.5 quantile (the median) and the 0.9 quantile (90th percentile, near the upper limit):" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "data_90 = data.quantile(0.9, dim='model')\n", "data_10 = data.quantile(0.1, dim='model')\n", "data_50 = data.quantile(0.5, dim='model')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "> **Note:** Any warning message raised by this cell is due to the presence of NaN (Not a Number) values, given that the historical and scenario datasets each cover only part (historical and future respectively) of the entire time series. As these datasets have been merged, NaN values will exist (e.g. there will be no data for the historical experiment for the future period)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### View time-series" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Finally, we will visualise these data in one time-series plot. We will use the matplotlib function `plot()`. The dimension `year` will be the x-axis and the near-surface air temperature values in degrees Celsius will be the y-axis. 
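The shaded band in this plot is built from the ensemble quantiles computed in the previous step. As a minimal numpy-only sketch of that step (synthetic values; the array layout and the names `ensemble`, `q10`, `q50` and `q90` are assumptions for illustration), the quantiles are taken across the model axis, and a year in which every model is NaN stays NaN:

```python
import warnings
import numpy as np

# Synthetic ensemble: 3 models x 4 years, last year missing for every model
ensemble = np.array([[13.0, 13.5, 14.0, np.nan],
                     [14.0, 14.5, 15.0, np.nan],
                     [15.0, 15.5, 16.0, np.nan]])

with warnings.catch_warnings():
    # An all-NaN year raises a RuntimeWarning, similar to the one noted above
    warnings.simplefilter('ignore', category=RuntimeWarning)
    q10, q50, q90 = np.nanquantile(ensemble, [0.1, 0.5, 0.9], axis=0)

print(q50)
```

The all-NaN year is why the historical curve simply stops in 2014 and the scenario curves start in 2015.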
\n", "\n", "The plotting function below has four main parts:\n", "* **Initiate the plot**: initiate a matplotlib plot with `plt.subplots()`\n", "* **Plot the time-series**: plot the data for each experiment, including the historical experiment and three scenarios with the `plot()` function\n", "* **Set axes limits, labels, title and legend**: Define title and axes labels, and add additional items to the plot, such as legend or gridlines\n", "* **Save the figure**: Save the figure as a PNG file with the `matplotlib.pyplot.savefig()` function" ] }, { "cell_type": "code", "execution_count": 33, "metadata": { "scrolled": true }, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "fig, ax = plt.subplots(1, 1, figsize = (16, 8))\n", "\n", "colours = ['black','red','green','blue']\n", "for i in np.arange(len(experiments)):\n", " ax.plot(data_50.year, data_50[i,:], color=f'{colours[i]}', \n", " label=f'{data_50.experiment[i].values} 50th quantile')\n", " ax.fill_between(data_50.year, data_90[i,:], data_10[i,:], alpha=0.1, color=f'{colours[i]}', \n", " label=f'{data_50.experiment[i].values} 10th and 90th quantile range')\n", "\n", "ax.set_xlim(1850,2100)\n", "ax.set_title('CMIP6 annual global average temperature (1850 to 2100)')\n", "ax.set_ylabel('tam (Celsius)')\n", "ax.set_xlabel('year')\n", "handles, labels = ax.get_legend_handles_labels()\n", "ax.legend(handles, labels)\n", "ax.grid(linestyle='--')\n", "\n", "fig.savefig(f'{DATADIR}CMIP6_annual_global_tas.png')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The visualization of the `CMIP6 annual global average temperature (1850 to 2100)` above shows that the global average temperature was more or less stable in the pre-industrial phase, but steadily increases since the 1990s. It shows further that, depending on the SSP scenario, the course and increase of the global annual temperature differs. While for the best case `SSP1-2.6` scenario, the global annual temperature could stabilize around 15 degC, in the worst case `SSP5-8.5` scenario, the global annual temperature could increase to above 20 degC." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "

\n", "

This project is licensed under APACHE License 2.0. | View on GitHub" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.9" } }, "nbformat": 4, "nbformat_minor": 4 }