{ "cells": [ { "cell_type": "markdown", "id": "54885250", "metadata": {}, "source": [ "![logo](./img/LogoLine_horizon_CAMS.png)" ] }, { "cell_type": "markdown", "id": "713c67e6", "metadata": {}, "source": [ "
" ] }, { "cell_type": "markdown", "id": "b18caf91", "metadata": {}, "source": [ "# Tutorial on how to import, subset, aggregate and export CAMS Data\n", "\n", "This tutorial provides practical examples that demonstrate how to download, read into Xarray, subset, aggregate and export data from the [Atmosphere Data Store (ADS)](https://ads.atmosphere.copernicus.eu/) of the [Copernicus Atmosphere Monitoring Service (CAMS)](https://atmosphere.copernicus.eu/)." ] }, { "cell_type": "markdown", "id": "45228b07", "metadata": {}, "source": [ "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", "
Run the tutorial via free cloud platforms: \n", " \"Binder\"\n", " \"Kaggle\"\n", " \"Colab\"
" ] }, { "cell_type": "markdown", "id": "f375f8f6", "metadata": {}, "source": [ "
" ] }, { "cell_type": "markdown", "id": "7a279bed", "metadata": {}, "source": [ "## Install ADS API\n", "\n", "We will need to install the Application Programming Interface (API) of the [Atmosphere Data Store (ADS)](https://ads-beta.atmosphere.copernicus.eu/). This will allow us to programmatically download data." ] }, { "cell_type": "markdown", "id": "a3efd2aa", "metadata": {}, "source": [ "```{note}\n", "Note the exclamation mark in the line of code below. This means the code will run as a shell (as opposed to a notebook) command.\n", "```" ] }, { "cell_type": "code", "execution_count": null, "id": "ddaaf27e", "metadata": { "execution": { "iopub.execute_input": "2024-09-12T13:08:31.667927Z", "iopub.status.busy": "2024-09-12T13:08:31.667499Z", "iopub.status.idle": "2024-09-12T13:08:44.622983Z", "shell.execute_reply": "2024-09-12T13:08:44.621511Z", "shell.execute_reply.started": "2024-09-12T13:08:31.667885Z" } }, "outputs": [], "source": [ "!pip install cdsapi" ] }, { "cell_type": "markdown", "id": "074bf254", "metadata": { "tags": [] }, "source": [ "## Import libraries\n", "\n", "Here we import a number of publicly available Python packages, needed for this tutorial." 
] }, { "cell_type": "code", "execution_count": 10, "id": "ccacb4a6", "metadata": { "execution": { "iopub.execute_input": "2024-09-12T13:08:46.026360Z", "iopub.status.busy": "2024-09-12T13:08:46.025923Z", "iopub.status.idle": "2024-09-12T13:08:46.033269Z", "shell.execute_reply": "2024-09-12T13:08:46.031764Z", "shell.execute_reply.started": "2024-09-12T13:08:46.026318Z" }, "tags": [] }, "outputs": [], "source": [ "# CDS API\n", "import cdsapi\n", "\n", "# Library to extract data\n", "from zipfile import ZipFile\n", "\n", "# Libraries to read and process arrays\n", "import numpy as np\n", "import xarray as xr\n", "import pandas as pd\n", "\n", "# Disable warnings for data download via API\n", "import urllib3 \n", "urllib3.disable_warnings()" ] }, { "cell_type": "markdown", "id": "b66df851", "metadata": { "tags": [] }, "source": [ "## Access data\n", "\n", "To access data from the ADS, you will need first to register (if you have not already done so), by visiting https://ads-beta.atmosphere.copernicus.eu/ and selecting **\"Login/Register\"**\n", "\n", "To obtain data programmatically from the ADS, you will need an API Key. This can be found in the page https://ads-beta.atmosphere.copernicus.eu/how-to-api. Here your key will appear automatically in the black window, assuming you have already registered and logged into the ADS. Your API key is the entire string of characters that appears after `key:`\n", "\n", "Now copy your API key into the code cell below, replacing `#######` with your key." 
] }, { "cell_type": "code", "execution_count": 11, "id": "a59f6cd6", "metadata": { "execution": { "iopub.execute_input": "2024-09-12T13:08:48.998728Z", "iopub.status.busy": "2024-09-12T13:08:48.998296Z", "iopub.status.idle": "2024-09-12T13:08:49.004463Z", "shell.execute_reply": "2024-09-12T13:08:49.003200Z", "shell.execute_reply.started": "2024-09-12T13:08:48.998685Z" } }, "outputs": [], "source": [ "URL = 'https://ads-beta.atmosphere.copernicus.eu/api'\n", "\n", "# Replace the hashtags with your key:\n", "KEY = '#############################'" ] }, { "cell_type": "markdown", "id": "d93e2335", "metadata": { "tags": [] }, "source": [ "Here we specify a data directory into which we will download our data and all output files that we will generate:" ] }, { "cell_type": "code", "execution_count": 12, "id": "aadc02ea", "metadata": { "execution": { "iopub.execute_input": "2024-09-12T13:08:50.598715Z", "iopub.status.busy": "2024-09-12T13:08:50.598260Z", "iopub.status.idle": "2024-09-12T13:08:50.603819Z", "shell.execute_reply": "2024-09-12T13:08:50.602729Z", "shell.execute_reply.started": "2024-09-12T13:08:50.598674Z" } }, "outputs": [], "source": [ "DATADIR = '.'" ] }, { "cell_type": "markdown", "id": "39d52a42", "metadata": {}, "source": [ "The data we will download and inspect in this tutorial comes from the CAMS Global Atmospheric Composition Forecast dataset. This can be found in the [Atmosphere Data Store (ADS)](https://ads-beta.atmosphere.copernicus.eu/) by scrolling through the datasets, or applying search filters as illustrated here:\n", "\n", "![logo](./img/ADS_search_and_result.png)" ] }, { "cell_type": "markdown", "id": "88d87462", "metadata": {}, "source": [ "Having selected the correct dataset, we now need to specify what product type, variables, temporal and geographic coverage we are interested in. These can all be selected in the **\"Download data\"** tab. 
In this tab a form appears in which we will select the following parameters to download:\n", "\n", "- Variables (Single level): *Dust aerosol optical depth at 550nm*, *Organic matter aerosol optical depth at 550nm*, *Total aerosol optical depth at 550nm*\n", "- Date: Start: *2021-08-01*, End: *2021-08-08*\n", "- Time: *00:00*, *12:00* (default)\n", "- Leadtime hour: *0* (only analysis)\n", "- Type: *Forecast* (default)\n", "- Area: Restricted area: *North: 90*, *East: 180*, *South: 0*, *West: -180* \n", "- Format: *Zipped netCDF (experimental)*\n", "\n", "At the end of the download form, select **\"Show API request\"**. This will reveal a block of code, which you can simply copy and paste into a cell of your Jupyter Notebook (see cell below)..." ] }, { "cell_type": "markdown", "id": "5ebfe0b2", "metadata": {}, "source": [ "```{note}\n", "Before running this code, ensure that you have **accepted the terms and conditions**. This is something you only need to do once for each CAMS dataset. 
You will find the option to do this by selecting the dataset in the ADS, then scrolling to the end of the *Download data* tab.\n", "```" ] }, { "cell_type": "code", "execution_count": 13, "id": "a81c5194", "metadata": { "execution": { "iopub.execute_input": "2024-09-12T13:08:59.852157Z", "iopub.status.busy": "2024-09-12T13:08:59.851739Z", "iopub.status.idle": "2024-09-12T13:11:51.521600Z", "shell.execute_reply": "2024-09-12T13:11:51.519929Z", "shell.execute_reply.started": "2024-09-12T13:08:59.852116Z" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "2024-09-12 13:09:00,125 INFO Request ID is cc9448c4-2415-4c46-a73f-316d98350daf\n", "2024-09-12 13:09:00,184 INFO status has been updated to accepted\n", "2024-09-12 13:09:01,721 INFO status has been updated to running\n", "2024-09-12 13:11:50,751 INFO Creating download object as zip with files:\n", "['data_sfc.nc']\n", "2024-09-12 13:11:50,752 INFO status has been updated to successful\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "e1d946fd28fd4cd2bb6891ca8edcadc9", "version_major": 2, "version_minor": 0 }, "text/plain": [ "6b5b7143834942c728fdaee2011a2c60.zip: 0%| | 0.00/24.1M [00:00\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
<xarray.Dataset> Size: 39MB\n",
       "Dimensions:                  (forecast_period: 1, forecast_reference_time: 16,\n",
       "                              latitude: 226, longitude: 900)\n",
       "Coordinates:\n",
       "  * forecast_period          (forecast_period) timedelta64[ns] 8B 00:00:00\n",
       "  * forecast_reference_time  (forecast_reference_time) datetime64[ns] 128B 20...\n",
       "  * latitude                 (latitude) float64 2kB 90.0 89.6 89.2 ... 0.4 0.0\n",
       "  * longitude                (longitude) float64 7kB -180.0 -179.6 ... 179.6\n",
       "    valid_time               (forecast_reference_time, forecast_period) datetime64[ns] 128B ...\n",
       "Data variables:\n",
       "    duaod550                 (forecast_period, forecast_reference_time, latitude, longitude) float32 13MB ...\n",
       "    omaod550                 (forecast_period, forecast_reference_time, latitude, longitude) float32 13MB ...\n",
       "    aod550                   (forecast_period, forecast_reference_time, latitude, longitude) float32 13MB ...\n",
       "Attributes:\n",
       "    GRIB_centre:             ecmf\n",
       "    GRIB_centreDescription:  European Centre for Medium-Range Weather Forecasts\n",
       "    GRIB_subCentre:          0\n",
       "    Conventions:             CF-1.7\n",
       "    institution:             European Centre for Medium-Range Weather Forecasts\n",
       "    history:                 2024-09-12T13:11 GRIB to CDM+CF via cfgrib-0.9.1...
" ], "text/plain": [ " Size: 39MB\n", "Dimensions: (forecast_period: 1, forecast_reference_time: 16,\n", " latitude: 226, longitude: 900)\n", "Coordinates:\n", " * forecast_period (forecast_period) timedelta64[ns] 8B 00:00:00\n", " * forecast_reference_time (forecast_reference_time) datetime64[ns] 128B 20...\n", " * latitude (latitude) float64 2kB 90.0 89.6 89.2 ... 0.4 0.0\n", " * longitude (longitude) float64 7kB -180.0 -179.6 ... 179.6\n", " valid_time (forecast_reference_time, forecast_period) datetime64[ns] 128B ...\n", "Data variables:\n", " duaod550 (forecast_period, forecast_reference_time, latitude, longitude) float32 13MB ...\n", " omaod550 (forecast_period, forecast_reference_time, latitude, longitude) float32 13MB ...\n", " aod550 (forecast_period, forecast_reference_time, latitude, longitude) float32 13MB ...\n", "Attributes:\n", " GRIB_centre: ecmf\n", " GRIB_centreDescription: European Centre for Medium-Range Weather Forecasts\n", " GRIB_subCentre: 0\n", " Conventions: CF-1.7\n", " institution: European Centre for Medium-Range Weather Forecasts\n", " history: 2024-09-12T13:11 GRIB to CDM+CF via cfgrib-0.9.1..." ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ds" ] }, { "cell_type": "markdown", "id": "6c8603bb", "metadata": {}, "source": [ "We see that the dataset has three variables. 
Selecting the \"show/hide attributes\" icons reveals their names: **\"omaod550\"** is \"Organic Matter Aerosol Optical Depth at 550nm\", **\"aod550\"** is \"Total Aerosol Optical Depth at 550nm\" and **\"duaod550\"** is \"Dust Aerosol Optical Depth at 550nm\".\n", "The dataset also has four dimension coordinates: **longitude**, **latitude**, **forecast_reference_time** and **forecast_period**.\n", "\n", "We will now look more carefully at the \"Total Aerosol Optical Depth at 550nm\" variable.\n", "\n", "While an Xarray **dataset** may contain multiple variables, an Xarray **data array** holds a single multi-dimensional variable and its coordinates. To make the processing of the **aod550** data easier, we convert it into an Xarray data array." ] }, { "cell_type": "code", "execution_count": 19, "id": "0885caa2", "metadata": { "execution": { "iopub.execute_input": "2024-09-12T13:14:29.133546Z", "iopub.status.busy": "2024-09-12T13:14:29.132744Z", "iopub.status.idle": "2024-09-12T13:14:29.157464Z", "shell.execute_reply": "2024-09-12T13:14:29.155989Z", "shell.execute_reply.started": "2024-09-12T13:14:29.133499Z" } }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
<xarray.DataArray 'aod550' (forecast_period: 1, forecast_reference_time: 16,\n",
       "                            latitude: 226, longitude: 900)> Size: 13MB\n",
       "[3254400 values with dtype=float32]\n",
       "Coordinates:\n",
       "  * forecast_period          (forecast_period) timedelta64[ns] 8B 00:00:00\n",
       "  * forecast_reference_time  (forecast_reference_time) datetime64[ns] 128B 20...\n",
       "  * latitude                 (latitude) float64 2kB 90.0 89.6 89.2 ... 0.4 0.0\n",
       "  * longitude                (longitude) float64 7kB -180.0 -179.6 ... 179.6\n",
       "    valid_time               (forecast_reference_time, forecast_period) datetime64[ns] 128B ...\n",
       "Attributes: (12/33)\n",
       "    GRIB_paramId:                             210207\n",
       "    GRIB_dataType:                            fc\n",
       "    GRIB_numberOfPoints:                      203400\n",
       "    GRIB_typeOfLevel:                         surface\n",
       "    GRIB_stepUnits:                           1\n",
       "    GRIB_stepType:                            instant\n",
       "    ...                                       ...\n",
       "    GRIB_units:                               ~\n",
       "    long_name:                                Total Aerosol Optical Depth at ...\n",
       "    units:                                    ~\n",
       "    standard_name:                            unknown\n",
       "    GRIB_number:                              0\n",
       "    GRIB_surface:                             0.0
" ], "text/plain": [ " Size: 13MB\n", "[3254400 values with dtype=float32]\n", "Coordinates:\n", " * forecast_period (forecast_period) timedelta64[ns] 8B 00:00:00\n", " * forecast_reference_time (forecast_reference_time) datetime64[ns] 128B 20...\n", " * latitude (latitude) float64 2kB 90.0 89.6 89.2 ... 0.4 0.0\n", " * longitude (longitude) float64 7kB -180.0 -179.6 ... 179.6\n", " valid_time (forecast_reference_time, forecast_period) datetime64[ns] 128B ...\n", "Attributes: (12/33)\n", " GRIB_paramId: 210207\n", " GRIB_dataType: fc\n", " GRIB_numberOfPoints: 203400\n", " GRIB_typeOfLevel: surface\n", " GRIB_stepUnits: 1\n", " GRIB_stepType: instant\n", " ... ...\n", " GRIB_units: ~\n", " long_name: Total Aerosol Optical Depth at ...\n", " units: ~\n", " standard_name: unknown\n", " GRIB_number: 0\n", " GRIB_surface: 0.0" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Create Xarray Data Array\n", "da = ds['aod550']\n", "da" ] }, { "cell_type": "markdown", "id": "6c3bbb88", "metadata": {}, "source": [ "## Subset data\n", "\n", "This section provides some selected examples of ways in which parts of a dataset can be extracted. More comprehensive documentation on how to index and select data is available here: https://docs.xarray.dev/en/stable/user-guide/indexing.html." ] }, { "cell_type": "markdown", "id": "8d617643", "metadata": {}, "source": [ "### Temporal subset" ] }, { "cell_type": "markdown", "id": "6d889ae1", "metadata": {}, "source": [ "By inspecting the array, we notice that the second of the four dimensions is time. If we wish to select only one time step, the easiest way to do this is to use positional indexing. The code below creates a Data Array of only the first time step." 
] }, { "cell_type": "code", "execution_count": 20, "id": "133c0364", "metadata": { "execution": { "iopub.execute_input": "2024-09-12T13:18:06.101582Z", "iopub.status.busy": "2024-09-12T13:18:06.101112Z", "iopub.status.idle": "2024-09-12T13:18:06.124299Z", "shell.execute_reply": "2024-09-12T13:18:06.123263Z", "shell.execute_reply.started": "2024-09-12T13:18:06.101538Z" } }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
<xarray.DataArray 'aod550' (forecast_period: 1, latitude: 226, longitude: 900)> Size: 814kB\n",
       "[203400 values with dtype=float32]\n",
       "Coordinates:\n",
       "  * forecast_period          (forecast_period) timedelta64[ns] 8B 00:00:00\n",
       "    forecast_reference_time  datetime64[ns] 8B 2021-08-01\n",
       "  * latitude                 (latitude) float64 2kB 90.0 89.6 89.2 ... 0.4 0.0\n",
       "  * longitude                (longitude) float64 7kB -180.0 -179.6 ... 179.6\n",
       "    valid_time               (forecast_period) datetime64[ns] 8B ...\n",
       "Attributes: (12/33)\n",
       "    GRIB_paramId:                             210207\n",
       "    GRIB_dataType:                            fc\n",
       "    GRIB_numberOfPoints:                      203400\n",
       "    GRIB_typeOfLevel:                         surface\n",
       "    GRIB_stepUnits:                           1\n",
       "    GRIB_stepType:                            instant\n",
       "    ...                                       ...\n",
       "    GRIB_units:                               ~\n",
       "    long_name:                                Total Aerosol Optical Depth at ...\n",
       "    units:                                    ~\n",
       "    standard_name:                            unknown\n",
       "    GRIB_number:                              0\n",
       "    GRIB_surface:                             0.0
" ], "text/plain": [ " Size: 814kB\n", "[203400 values with dtype=float32]\n", "Coordinates:\n", " * forecast_period (forecast_period) timedelta64[ns] 8B 00:00:00\n", " forecast_reference_time datetime64[ns] 8B 2021-08-01\n", " * latitude (latitude) float64 2kB 90.0 89.6 89.2 ... 0.4 0.0\n", " * longitude (longitude) float64 7kB -180.0 -179.6 ... 179.6\n", " valid_time (forecast_period) datetime64[ns] 8B ...\n", "Attributes: (12/33)\n", " GRIB_paramId: 210207\n", " GRIB_dataType: fc\n", " GRIB_numberOfPoints: 203400\n", " GRIB_typeOfLevel: surface\n", " GRIB_stepUnits: 1\n", " GRIB_stepType: instant\n", " ... ...\n", " GRIB_units: ~\n", " long_name: Total Aerosol Optical Depth at ...\n", " units: ~\n", " standard_name: unknown\n", " GRIB_number: 0\n", " GRIB_surface: 0.0" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "time0 = da[:,0,:,:]\n", "time0" ] }, { "cell_type": "markdown", "id": "3b568124", "metadata": {}, "source": [ "And this creates a Data Array of the first 5 time steps:" ] }, { "cell_type": "code", "execution_count": 21, "id": "970f95d5", "metadata": { "execution": { "iopub.execute_input": "2024-09-12T13:18:41.306068Z", "iopub.status.busy": "2024-09-12T13:18:41.305593Z", "iopub.status.idle": "2024-09-12T13:18:41.329547Z", "shell.execute_reply": "2024-09-12T13:18:41.328227Z", "shell.execute_reply.started": "2024-09-12T13:18:41.306030Z" } }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
<xarray.DataArray 'aod550' (forecast_period: 1, forecast_reference_time: 5,\n",
       "                            latitude: 226, longitude: 900)> Size: 4MB\n",
       "[1017000 values with dtype=float32]\n",
       "Coordinates:\n",
       "  * forecast_period          (forecast_period) timedelta64[ns] 8B 00:00:00\n",
       "  * forecast_reference_time  (forecast_reference_time) datetime64[ns] 40B 202...\n",
       "  * latitude                 (latitude) float64 2kB 90.0 89.6 89.2 ... 0.4 0.0\n",
       "  * longitude                (longitude) float64 7kB -180.0 -179.6 ... 179.6\n",
       "    valid_time               (forecast_reference_time, forecast_period) datetime64[ns] 40B ...\n",
       "Attributes: (12/33)\n",
       "    GRIB_paramId:                             210207\n",
       "    GRIB_dataType:                            fc\n",
       "    GRIB_numberOfPoints:                      203400\n",
       "    GRIB_typeOfLevel:                         surface\n",
       "    GRIB_stepUnits:                           1\n",
       "    GRIB_stepType:                            instant\n",
       "    ...                                       ...\n",
       "    GRIB_units:                               ~\n",
       "    long_name:                                Total Aerosol Optical Depth at ...\n",
       "    units:                                    ~\n",
       "    standard_name:                            unknown\n",
       "    GRIB_number:                              0\n",
       "    GRIB_surface:                             0.0
" ], "text/plain": [ " Size: 4MB\n", "[1017000 values with dtype=float32]\n", "Coordinates:\n", " * forecast_period (forecast_period) timedelta64[ns] 8B 00:00:00\n", " * forecast_reference_time (forecast_reference_time) datetime64[ns] 40B 202...\n", " * latitude (latitude) float64 2kB 90.0 89.6 89.2 ... 0.4 0.0\n", " * longitude (longitude) float64 7kB -180.0 -179.6 ... 179.6\n", " valid_time (forecast_reference_time, forecast_period) datetime64[ns] 40B ...\n", "Attributes: (12/33)\n", " GRIB_paramId: 210207\n", " GRIB_dataType: fc\n", " GRIB_numberOfPoints: 203400\n", " GRIB_typeOfLevel: surface\n", " GRIB_stepUnits: 1\n", " GRIB_stepType: instant\n", " ... ...\n", " GRIB_units: ~\n", " long_name: Total Aerosol Optical Depth at ...\n", " units: ~\n", " standard_name: unknown\n", " GRIB_number: 0\n", " GRIB_surface: 0.0" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "time_5steps = da[:,0:5,:,:]\n", "time_5steps" ] }, { "cell_type": "markdown", "id": "8039716b", "metadata": {}, "source": [ "Another way to select data is to use the `.sel()` method of xarray. The example below selects all data from the first of August." 
] }, { "cell_type": "code", "execution_count": 22, "id": "1d415903", "metadata": { "execution": { "iopub.execute_input": "2024-09-12T13:19:27.622635Z", "iopub.status.busy": "2024-09-12T13:19:27.622219Z", "iopub.status.idle": "2024-09-12T13:19:27.640464Z", "shell.execute_reply": "2024-09-12T13:19:27.639200Z", "shell.execute_reply.started": "2024-09-12T13:19:27.622594Z" } }, "outputs": [], "source": [ "firstAug = da.sel(forecast_reference_time='2021-08-01')" ] }, { "cell_type": "markdown", "id": "a2544e3c", "metadata": {}, "source": [ "We can also select a time range using label based indexing, with the `loc` attribute:" ] }, { "cell_type": "code", "execution_count": 26, "id": "bb0d82d4", "metadata": { "execution": { "iopub.execute_input": "2024-09-12T13:24:22.279698Z", "iopub.status.busy": "2024-09-12T13:24:22.279250Z", "iopub.status.idle": "2024-09-12T13:24:22.317954Z", "shell.execute_reply": "2024-09-12T13:24:22.316605Z", "shell.execute_reply.started": "2024-09-12T13:24:22.279657Z" } }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
<xarray.DataArray 'aod550' (forecast_period: 1, forecast_reference_time: 6,\n",
       "                            latitude: 226, longitude: 900)> Size: 5MB\n",
       "[1220400 values with dtype=float32]\n",
       "Coordinates:\n",
       "  * forecast_period          (forecast_period) timedelta64[ns] 8B 00:00:00\n",
       "  * forecast_reference_time  (forecast_reference_time) datetime64[ns] 48B 202...\n",
       "  * latitude                 (latitude) float64 2kB 90.0 89.6 89.2 ... 0.4 0.0\n",
       "  * longitude                (longitude) float64 7kB -180.0 -179.6 ... 179.6\n",
       "    valid_time               (forecast_reference_time, forecast_period) datetime64[ns] 48B ...\n",
       "Attributes: (12/33)\n",
       "    GRIB_paramId:                             210207\n",
       "    GRIB_dataType:                            fc\n",
       "    GRIB_numberOfPoints:                      203400\n",
       "    GRIB_typeOfLevel:                         surface\n",
       "    GRIB_stepUnits:                           1\n",
       "    GRIB_stepType:                            instant\n",
       "    ...                                       ...\n",
       "    GRIB_units:                               ~\n",
       "    long_name:                                Total Aerosol Optical Depth at ...\n",
       "    units:                                    ~\n",
       "    standard_name:                            unknown\n",
       "    GRIB_number:                              0\n",
       "    GRIB_surface:                             0.0
" ], "text/plain": [ " Size: 5MB\n", "[1220400 values with dtype=float32]\n", "Coordinates:\n", " * forecast_period (forecast_period) timedelta64[ns] 8B 00:00:00\n", " * forecast_reference_time (forecast_reference_time) datetime64[ns] 48B 202...\n", " * latitude (latitude) float64 2kB 90.0 89.6 89.2 ... 0.4 0.0\n", " * longitude (longitude) float64 7kB -180.0 -179.6 ... 179.6\n", " valid_time (forecast_reference_time, forecast_period) datetime64[ns] 48B ...\n", "Attributes: (12/33)\n", " GRIB_paramId: 210207\n", " GRIB_dataType: fc\n", " GRIB_numberOfPoints: 203400\n", " GRIB_typeOfLevel: surface\n", " GRIB_stepUnits: 1\n", " GRIB_stepType: instant\n", " ... ...\n", " GRIB_units: ~\n", " long_name: Total Aerosol Optical Depth at ...\n", " units: ~\n", " standard_name: unknown\n", " GRIB_number: 0\n", " GRIB_surface: 0.0" ] }, "execution_count": 26, "metadata": {}, "output_type": "execute_result" } ], "source": [ "period = da.loc[:,\"2021-08-01\":\"2021-08-03\",:,:]\n", "period" ] }, { "cell_type": "markdown", "id": "1bfb81bd", "metadata": {}, "source": [ "### Geographic subset\n", "\n", "Geographical subsetting works in much the same way as temporal subsetting, with the difference that instead of one dimension we now have two (or even three if we inlcude altitude)." ] }, { "cell_type": "markdown", "id": "ed2038ae", "metadata": {}, "source": [ "#### Select nearest grid cell\n", "\n", "In some cases, we may want to find the geographic grid cell that is situated nearest to a particular location of interest, such as a city. In this case we can use `.sel()`, and make use of the `method` keyword argument, which enables nearest neighbor (inexact) lookups. In the example below, we look for the geographic grid cell nearest to Paris." 
] }, { "cell_type": "code", "execution_count": 27, "id": "108ce706", "metadata": { "execution": { "iopub.execute_input": "2024-09-12T13:24:36.382197Z", "iopub.status.busy": "2024-09-12T13:24:36.381255Z", "iopub.status.idle": "2024-09-12T13:24:36.396280Z", "shell.execute_reply": "2024-09-12T13:24:36.394637Z", "shell.execute_reply.started": "2024-09-12T13:24:36.382151Z" } }, "outputs": [], "source": [ "paris_lat = 48.9\n", "paris_lon = 2.4\n", "\n", "paris = da.sel(latitude=paris_lat, longitude=paris_lon, method='nearest')" ] }, { "cell_type": "code", "execution_count": 28, "id": "5f0f4448", "metadata": { "execution": { "iopub.execute_input": "2024-09-12T13:24:37.839139Z", "iopub.status.busy": "2024-09-12T13:24:37.838679Z", "iopub.status.idle": "2024-09-12T13:24:37.858452Z", "shell.execute_reply": "2024-09-12T13:24:37.857227Z", "shell.execute_reply.started": "2024-09-12T13:24:37.839093Z" } }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
<xarray.DataArray 'aod550' (forecast_period: 1, forecast_reference_time: 16)> Size: 64B\n",
       "[16 values with dtype=float32]\n",
       "Coordinates:\n",
       "  * forecast_period          (forecast_period) timedelta64[ns] 8B 00:00:00\n",
       "  * forecast_reference_time  (forecast_reference_time) datetime64[ns] 128B 20...\n",
       "    latitude                 float64 8B 48.8\n",
       "    longitude                float64 8B 2.4\n",
       "    valid_time               (forecast_reference_time, forecast_period) datetime64[ns] 128B ...\n",
       "Attributes: (12/33)\n",
       "    GRIB_paramId:                             210207\n",
       "    GRIB_dataType:                            fc\n",
       "    GRIB_numberOfPoints:                      203400\n",
       "    GRIB_typeOfLevel:                         surface\n",
       "    GRIB_stepUnits:                           1\n",
       "    GRIB_stepType:                            instant\n",
       "    ...                                       ...\n",
       "    GRIB_units:                               ~\n",
       "    long_name:                                Total Aerosol Optical Depth at ...\n",
       "    units:                                    ~\n",
       "    standard_name:                            unknown\n",
       "    GRIB_number:                              0\n",
       "    GRIB_surface:                             0.0
" ], "text/plain": [ " Size: 64B\n", "[16 values with dtype=float32]\n", "Coordinates:\n", " * forecast_period (forecast_period) timedelta64[ns] 8B 00:00:00\n", " * forecast_reference_time (forecast_reference_time) datetime64[ns] 128B 20...\n", " latitude float64 8B 48.8\n", " longitude float64 8B 2.4\n", " valid_time (forecast_reference_time, forecast_period) datetime64[ns] 128B ...\n", "Attributes: (12/33)\n", " GRIB_paramId: 210207\n", " GRIB_dataType: fc\n", " GRIB_numberOfPoints: 203400\n", " GRIB_typeOfLevel: surface\n", " GRIB_stepUnits: 1\n", " GRIB_stepType: instant\n", " ... ...\n", " GRIB_units: ~\n", " long_name: Total Aerosol Optical Depth at ...\n", " units: ~\n", " standard_name: unknown\n", " GRIB_number: 0\n", " GRIB_surface: 0.0" ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" } ], "source": [ "paris" ] }, { "cell_type": "markdown", "id": "97b13d46", "metadata": {}, "source": [ "#### Regional subset\n", "\n", "Often we may wish to select a regional subset. Note that you can specify a region of interest in the [ADS](https://ads-beta.atmosphere.copernicus.eu/) prior to downloading data. This is more efficient as it reduces the data volume. However, there may be cases when you wish to select a regional subset after download. One way to do this is with the `.where()` function. \n", "\n", "In the previous examples, we have used methods that return a subset of the original data. By default `.where()` maintains the original size of the data, with selected elements masked (which become \"not a number\", or `nan`). Use of the option `drop=True` clips coordinate elements that are fully masked.\n", "\n", "The example below uses `.where()` to select a geographic subset from 30 to 60 degrees latitude. We could also specify longitudinal boundaries, by simply adding further conditions." 
] }, { "cell_type": "code", "execution_count": 29, "id": "af74e3bc", "metadata": { "execution": { "iopub.execute_input": "2024-09-12T13:24:47.629405Z", "iopub.status.busy": "2024-09-12T13:24:47.628988Z", "iopub.status.idle": "2024-09-12T13:24:47.704007Z", "shell.execute_reply": "2024-09-12T13:24:47.702738Z", "shell.execute_reply.started": "2024-09-12T13:24:47.629366Z" } }, "outputs": [], "source": [ "mid_lat = da.where((da.latitude > 30.) & (da.latitude < 60.), drop=True)" ] }, { "cell_type": "markdown", "id": "952e7370", "metadata": {}, "source": [ "## Aggregate data\n", "\n", "Another common task is to aggregate data. This may include reducing hourly data to daily means, minima, maxima, or other statistical properties. We may also wish to aggregate over one or more dimensions, such as averaging over all latitudes and longitudes to obtain one global value." ] }, { "cell_type": "markdown", "id": "d20037e0", "metadata": {}, "source": [ "### Temporal aggregation\n", "\n", "To aggregate over one or more dimensions, we can apply one of a number of methods to the original dataset, such as `.mean()`, `.min()`, `.max()`, `.median()` and others (see https://docs.xarray.dev/en/stable/api.html#id6 for the full list). \n", "\n", "The example below takes the mean of all time steps. The `keep_attrs` parameter is optional. If set to `True` it will keep the original attributes of the Data Array (i.e. description of variable, units, etc). If set to `False`, the attributes will be stripped." ] }, { "cell_type": "code", "execution_count": 30, "id": "26d5b430", "metadata": { "execution": { "iopub.execute_input": "2024-09-12T13:25:29.934199Z", "iopub.status.busy": "2024-09-12T13:25:29.933799Z", "iopub.status.idle": "2024-09-12T13:25:30.075958Z", "shell.execute_reply": "2024-09-12T13:25:30.074820Z", "shell.execute_reply.started": "2024-09-12T13:25:29.934163Z" } }, "outputs": [ { "data": { "text/html": [ "
<xarray.DataArray 'aod550' (forecast_period: 1, latitude: 226, longitude: 900)> Size: 814kB\n",
       "array([[[0.50736654, 0.50736654, 0.50736654, ..., 0.50736654,\n",
       "         0.50736654, 0.50736654],\n",
       "        [0.46713126, 0.46682942, 0.466529  , ..., 0.46804398,\n",
       "         0.46773863, 0.46743435],\n",
       "        [0.43601978, 0.43541682, 0.4348175 , ..., 0.43785065,\n",
       "         0.43723673, 0.43662643],\n",
       "        ...,\n",
       "        [0.07072866, 0.06997345, 0.06995939, ..., 0.0722326 ,\n",
       "         0.07046068, 0.0704512 ],\n",
       "        [0.07082945, 0.06887189, 0.06728195, ..., 0.07400212,\n",
       "         0.07277834, 0.0719097 ],\n",
       "        [0.06977722, 0.06664259, 0.06410388, ..., 0.07571781,\n",
       "         0.07463253, 0.07250822]]], dtype=float32)\n",
       "Coordinates:\n",
       "  * forecast_period  (forecast_period) timedelta64[ns] 8B 00:00:00\n",
       "  * latitude         (latitude) float64 2kB 90.0 89.6 89.2 88.8 ... 0.8 0.4 0.0\n",
       "  * longitude        (longitude) float64 7kB -180.0 -179.6 ... 179.2 179.6\n",
       "Attributes: (12/33)\n",
       "    GRIB_paramId:                             210207\n",
       "    GRIB_dataType:                            fc\n",
       "    GRIB_numberOfPoints:                      203400\n",
       "    GRIB_typeOfLevel:                         surface\n",
       "    GRIB_stepUnits:                           1\n",
       "    GRIB_stepType:                            instant\n",
       "    ...                                       ...\n",
       "    GRIB_units:                               ~\n",
       "    long_name:                                Total Aerosol Optical Depth at ...\n",
       "    units:                                    ~\n",
       "    standard_name:                            unknown\n",
       "    GRIB_number:                              0\n",
       "    GRIB_surface:                             0.0
" ], "text/plain": [ " Size: 814kB\n", "array([[[0.50736654, 0.50736654, 0.50736654, ..., 0.50736654,\n", " 0.50736654, 0.50736654],\n", " [0.46713126, 0.46682942, 0.466529 , ..., 0.46804398,\n", " 0.46773863, 0.46743435],\n", " [0.43601978, 0.43541682, 0.4348175 , ..., 0.43785065,\n", " 0.43723673, 0.43662643],\n", " ...,\n", " [0.07072866, 0.06997345, 0.06995939, ..., 0.0722326 ,\n", " 0.07046068, 0.0704512 ],\n", " [0.07082945, 0.06887189, 0.06728195, ..., 0.07400212,\n", " 0.07277834, 0.0719097 ],\n", " [0.06977722, 0.06664259, 0.06410388, ..., 0.07571781,\n", " 0.07463253, 0.07250822]]], dtype=float32)\n", "Coordinates:\n", " * forecast_period (forecast_period) timedelta64[ns] 8B 00:00:00\n", " * latitude (latitude) float64 2kB 90.0 89.6 89.2 88.8 ... 0.8 0.4 0.0\n", " * longitude (longitude) float64 7kB -180.0 -179.6 ... 179.2 179.6\n", "Attributes: (12/33)\n", " GRIB_paramId: 210207\n", " GRIB_dataType: fc\n", " GRIB_numberOfPoints: 203400\n", " GRIB_typeOfLevel: surface\n", " GRIB_stepUnits: 1\n", " GRIB_stepType: instant\n", " ... ...\n", " GRIB_units: ~\n", " long_name: Total Aerosol Optical Depth at ...\n", " units: ~\n", " standard_name: unknown\n", " GRIB_number: 0\n", " GRIB_surface: 0.0" ] }, "execution_count": 30, "metadata": {}, "output_type": "execute_result" } ], "source": [ "time_mean = da.mean(dim=\"forecast_reference_time\", keep_attrs=True)\n", "time_mean" ] }, { "cell_type": "markdown", "id": "27c70d0c", "metadata": {}, "source": [ "Instead of reducing an entire dimension to one value, we may wish to reduce the frequency within a dimension. For example, we can reduce hourly data to daily max values. 
One way to do this is using `groupby()` combined with the `.max()` aggregate function, as shown below:" ] }, { "cell_type": "code", "execution_count": 32, "id": "b6e7e8b6", "metadata": { "execution": { "iopub.execute_input": "2024-09-12T13:26:19.474877Z", "iopub.status.busy": "2024-09-12T13:26:19.474376Z", "iopub.status.idle": "2024-09-12T13:26:19.521150Z", "shell.execute_reply": "2024-09-12T13:26:19.520064Z", "shell.execute_reply.started": "2024-09-12T13:26:19.474836Z" } }, "outputs": [ { "data": { "text/html": [ "
<xarray.DataArray 'aod550' (forecast_period: 1, day: 8, latitude: 226,\n",
       "                            longitude: 900)> Size: 7MB\n",
       "array([[[[0.26269364, 0.26269364, 0.26269364, ..., 0.26269364,\n",
       "          0.26269364, 0.26269364],\n",
       "         [0.22594188, 0.22530101, 0.22465824, ..., 0.22786163,\n",
       "          0.22722267, 0.22658275],\n",
       "         [0.24543594, 0.2441523 , 0.2428677 , ..., 0.24927448,\n",
       "          0.24799655, 0.24671672],\n",
       "         ...,\n",
       "         [0.0983517 , 0.09476589, 0.08715175, ..., 0.10342334,\n",
       "          0.10067485, 0.09867404],\n",
       "         [0.12430927, 0.11521217, 0.10060379, ..., 0.11608146,\n",
       "          0.11144374, 0.11907455],\n",
       "         [0.11796734, 0.11791679, 0.10847542, ..., 0.11883758,\n",
       "          0.11546157, 0.11625931]],\n",
       "\n",
       "        [[2.568343  , 2.568343  , 2.568343  , ..., 2.568343  ,\n",
       "          2.568343  , 2.568343  ],\n",
       "         [2.6602957 , 2.659777  , 2.6592658 , ..., 2.6618922 ,\n",
       "          2.6613533 , 2.6608207 ],\n",
       "         [2.514293  , 2.5132563 , 2.512235  , ..., 2.5174892 ,\n",
       "          2.5164092 , 2.515344  ],\n",
       "...\n",
       "         [0.09576356, 0.09529388, 0.09411991, ..., 0.08724582,\n",
       "          0.09197176, 0.09367597],\n",
       "         [0.0994662 , 0.09842908, 0.09573305, ..., 0.08806264,\n",
       "          0.09281051, 0.09561813],\n",
       "         [0.09634721, 0.09436309, 0.09271324, ..., 0.08776033,\n",
       "          0.09257209, 0.09482419]],\n",
       "\n",
       "        [[0.09904701, 0.09904701, 0.09904701, ..., 0.09904701,\n",
       "          0.09904701, 0.09904701],\n",
       "         [0.1035803 , 0.10347682, 0.10337335, ..., 0.10388643,\n",
       "          0.10378486, 0.10368282],\n",
       "         [0.11281282, 0.11260682, 0.11240035, ..., 0.11342746,\n",
       "          0.11322337, 0.11301833],\n",
       "         ...,\n",
       "         [0.0636552 , 0.07602721, 0.08213645, ..., 0.0609377 ,\n",
       "          0.05778098, 0.05757457],\n",
       "         [0.05529577, 0.06107312, 0.06196576, ..., 0.06288081,\n",
       "          0.05802709, 0.05498534],\n",
       "         [0.0544017 , 0.05245811, 0.04827911, ..., 0.0706194 ,\n",
       "          0.06566173, 0.05943328]]]], dtype=float32)\n",
       "Coordinates:\n",
       "  * forecast_period  (forecast_period) timedelta64[ns] 8B 00:00:00\n",
       "  * latitude         (latitude) float64 2kB 90.0 89.6 89.2 88.8 ... 0.8 0.4 0.0\n",
       "  * longitude        (longitude) float64 7kB -180.0 -179.6 ... 179.2 179.6\n",
       "  * day              (day) int64 64B 1 2 3 4 5 6 7 8\n",
       "Attributes: (12/33)\n",
       "    GRIB_paramId:                             210207\n",
       "    GRIB_dataType:                            fc\n",
       "    GRIB_numberOfPoints:                      203400\n",
       "    GRIB_typeOfLevel:                         surface\n",
       "    GRIB_stepUnits:                           1\n",
       "    GRIB_stepType:                            instant\n",
       "    ...                                       ...\n",
       "    GRIB_units:                               ~\n",
       "    long_name:                                Total Aerosol Optical Depth at ...\n",
       "    units:                                    ~\n",
       "    standard_name:                            unknown\n",
       "    GRIB_number:                              0\n",
       "    GRIB_surface:                             0.0
" ], "text/plain": [ " Size: 7MB\n", "array([[[[0.26269364, 0.26269364, 0.26269364, ..., 0.26269364,\n", " 0.26269364, 0.26269364],\n", " [0.22594188, 0.22530101, 0.22465824, ..., 0.22786163,\n", " 0.22722267, 0.22658275],\n", " [0.24543594, 0.2441523 , 0.2428677 , ..., 0.24927448,\n", " 0.24799655, 0.24671672],\n", " ...,\n", " [0.0983517 , 0.09476589, 0.08715175, ..., 0.10342334,\n", " 0.10067485, 0.09867404],\n", " [0.12430927, 0.11521217, 0.10060379, ..., 0.11608146,\n", " 0.11144374, 0.11907455],\n", " [0.11796734, 0.11791679, 0.10847542, ..., 0.11883758,\n", " 0.11546157, 0.11625931]],\n", "\n", " [[2.568343 , 2.568343 , 2.568343 , ..., 2.568343 ,\n", " 2.568343 , 2.568343 ],\n", " [2.6602957 , 2.659777 , 2.6592658 , ..., 2.6618922 ,\n", " 2.6613533 , 2.6608207 ],\n", " [2.514293 , 2.5132563 , 2.512235 , ..., 2.5174892 ,\n", " 2.5164092 , 2.515344 ],\n", "...\n", " [0.09576356, 0.09529388, 0.09411991, ..., 0.08724582,\n", " 0.09197176, 0.09367597],\n", " [0.0994662 , 0.09842908, 0.09573305, ..., 0.08806264,\n", " 0.09281051, 0.09561813],\n", " [0.09634721, 0.09436309, 0.09271324, ..., 0.08776033,\n", " 0.09257209, 0.09482419]],\n", "\n", " [[0.09904701, 0.09904701, 0.09904701, ..., 0.09904701,\n", " 0.09904701, 0.09904701],\n", " [0.1035803 , 0.10347682, 0.10337335, ..., 0.10388643,\n", " 0.10378486, 0.10368282],\n", " [0.11281282, 0.11260682, 0.11240035, ..., 0.11342746,\n", " 0.11322337, 0.11301833],\n", " ...,\n", " [0.0636552 , 0.07602721, 0.08213645, ..., 0.0609377 ,\n", " 0.05778098, 0.05757457],\n", " [0.05529577, 0.06107312, 0.06196576, ..., 0.06288081,\n", " 0.05802709, 0.05498534],\n", " [0.0544017 , 0.05245811, 0.04827911, ..., 0.0706194 ,\n", " 0.06566173, 0.05943328]]]], dtype=float32)\n", "Coordinates:\n", " * forecast_period (forecast_period) timedelta64[ns] 8B 00:00:00\n", " * latitude (latitude) float64 2kB 90.0 89.6 89.2 88.8 ... 0.8 0.4 0.0\n", " * longitude (longitude) float64 7kB -180.0 -179.6 ... 
179.2 179.6\n", " * day (day) int64 64B 1 2 3 4 5 6 7 8\n", "Attributes: (12/33)\n", " GRIB_paramId: 210207\n", " GRIB_dataType: fc\n", " GRIB_numberOfPoints: 203400\n", " GRIB_typeOfLevel: surface\n", " GRIB_stepUnits: 1\n", " GRIB_stepType: instant\n", " ... ...\n", " GRIB_units: ~\n", " long_name: Total Aerosol Optical Depth at ...\n", " units: ~\n", " standard_name: unknown\n", " GRIB_number: 0\n", " GRIB_surface: 0.0" ] }, "execution_count": 32, "metadata": {}, "output_type": "execute_result" } ], "source": [ "daily_max = da.groupby('forecast_reference_time.day').max(keep_attrs=True)\n", "daily_max" ] }, { "cell_type": "markdown", "id": "a7cc183f", "metadata": {}, "source": [ "### Spatial aggregation\n", "\n", "We can apply the same principles to spatial aggregation. An important consideration when aggregating over latitude is the variation in area that the gridded data represents. To account for this, we would need to calculate the area of each grid cell. A simpler solution, however, is to use the cosine of the latitude as a proxy, since on a regular latitude-longitude grid the area of a grid cell is proportional to the cosine of its latitude.\n", "\n", "The example below demonstrates how to calculate a spatial average of total AOD, applied to the temporal mean we previously calculated, to obtain a single mean value of total AOD averaged in space and time.\n", "\n", "We first calculate the cosine of the latitudes, having converted these from degrees to radians. We then apply these to the Data Array as weights."
] }, { "cell_type": "code", "execution_count": 33, "id": "2302531c", "metadata": { "execution": { "iopub.execute_input": "2024-09-12T13:26:31.247588Z", "iopub.status.busy": "2024-09-12T13:26:31.247167Z", "iopub.status.idle": "2024-09-12T13:26:31.257424Z", "shell.execute_reply": "2024-09-12T13:26:31.256389Z", "shell.execute_reply.started": "2024-09-12T13:26:31.247548Z" } }, "outputs": [], "source": [ "weights = np.cos(np.deg2rad(time_mean.latitude))\n", "weights.name = \"weights\"\n", "time_mean_weighted = time_mean.weighted(weights)" ] }, { "cell_type": "markdown", "id": "e6bbf34d", "metadata": {}, "source": [ "Now we apply the aggregate function `.mean()` to obtain a weighted average." ] }, { "cell_type": "code", "execution_count": 34, "id": "aa249af3", "metadata": { "execution": { "iopub.execute_input": "2024-09-12T13:26:34.993757Z", "iopub.status.busy": "2024-09-12T13:26:34.993343Z", "iopub.status.idle": "2024-09-12T13:26:35.066752Z", "shell.execute_reply": "2024-09-12T13:26:35.065690Z", "shell.execute_reply.started": "2024-09-12T13:26:34.993720Z" } }, "outputs": [ { "data": { "text/html": [ "
<xarray.DataArray 'aod550' (forecast_period: 1)> Size: 8B\n",
       "array([0.2940043])\n",
       "Coordinates:\n",
       "  * forecast_period  (forecast_period) timedelta64[ns] 8B 00:00:00
" ], "text/plain": [ " Size: 8B\n", "array([0.2940043])\n", "Coordinates:\n", " * forecast_period (forecast_period) timedelta64[ns] 8B 00:00:00" ] }, "execution_count": 34, "metadata": {}, "output_type": "execute_result" } ], "source": [ "Total_AOD = time_mean_weighted.mean([\"longitude\", \"latitude\"])\n", "Total_AOD" ] }, { "cell_type": "markdown", "id": "1189c393", "metadata": {}, "source": [ "## Export data\n", "\n", "This section includes a few examples of how to export data." ] }, { "cell_type": "markdown", "id": "91bcd358", "metadata": {}, "source": [ "### Export data as NetCDF\n", "\n", "The code below provides a simple example of how to export data to NetCDF." ] }, { "cell_type": "code", "execution_count": 35, "id": "f44b5cc0", "metadata": { "execution": { "iopub.execute_input": "2024-09-12T13:26:45.478744Z", "iopub.status.busy": "2024-09-12T13:26:45.478325Z", "iopub.status.idle": "2024-09-12T13:26:45.574055Z", "shell.execute_reply": "2024-09-12T13:26:45.572843Z", "shell.execute_reply.started": "2024-09-12T13:26:45.478707Z" } }, "outputs": [], "source": [ "paris.to_netcdf(f'{DATADIR}/2021-08_AOD_Paris.nc')" ] }, { "cell_type": "markdown", "id": "0cf6d871", "metadata": {}, "source": [ "### Export data as CSV" ] }, { "cell_type": "markdown", "id": "bef10e05", "metadata": {}, "source": [ "You may wish to export this data into a format which enables processing with other tools. A commonly used file format is CSV, or \"Comma Separated Values\", which can be used in software such as Microsoft Excel. This section explains how to export data from an xarray object into CSV. Xarray does not have a function to export directly into CSV, so instead we use the Pandas library. We will read the data into a Pandas Data Frame, then write to a CSV file using a dedicated Pandas function." 
] }, { "cell_type": "code", "execution_count": 36, "id": "5e1c9b09", "metadata": { "execution": { "iopub.execute_input": "2024-09-12T13:26:49.382056Z", "iopub.status.busy": "2024-09-12T13:26:49.381657Z", "iopub.status.idle": "2024-09-12T13:26:49.403747Z", "shell.execute_reply": "2024-09-12T13:26:49.402171Z", "shell.execute_reply.started": "2024-09-12T13:26:49.382021Z" } }, "outputs": [], "source": [ "df = paris.to_dataframe()" ] }, { "cell_type": "code", "execution_count": 37, "id": "762aab7f", "metadata": { "execution": { "iopub.execute_input": "2024-09-12T13:26:50.593559Z", "iopub.status.busy": "2024-09-12T13:26:50.593153Z", "iopub.status.idle": "2024-09-12T13:26:50.616117Z", "shell.execute_reply": "2024-09-12T13:26:50.614870Z", "shell.execute_reply.started": "2024-09-12T13:26:50.593522Z" } }, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " latitude longitude \\\n", "forecast_period forecast_reference_time \n", "0 days 2021-08-01 00:00:00 48.8 2.4 \n", " 2021-08-01 12:00:00 48.8 2.4 \n", " 2021-08-02 00:00:00 48.8 2.4 \n", " 2021-08-02 12:00:00 48.8 2.4 \n", " 2021-08-03 00:00:00 48.8 2.4 \n", " 2021-08-03 12:00:00 48.8 2.4 \n", " 2021-08-04 00:00:00 48.8 2.4 \n", " 2021-08-04 12:00:00 48.8 2.4 \n", " 2021-08-05 00:00:00 48.8 2.4 \n", " 2021-08-05 12:00:00 48.8 2.4 \n", " 2021-08-06 00:00:00 48.8 2.4 \n", " 2021-08-06 12:00:00 48.8 2.4 \n", " 2021-08-07 00:00:00 48.8 2.4 \n", " 2021-08-07 12:00:00 48.8 2.4 \n", " 2021-08-08 00:00:00 48.8 2.4 \n", " 2021-08-08 12:00:00 48.8 2.4 \n", "\n", " valid_time aod550 \n", "forecast_period forecast_reference_time \n", "0 days 2021-08-01 00:00:00 2021-08-01 00:00:00 0.171175 \n", " 2021-08-01 12:00:00 2021-08-01 12:00:00 0.264477 \n", " 2021-08-02 00:00:00 2021-08-02 00:00:00 0.174434 \n", " 2021-08-02 12:00:00 2021-08-02 12:00:00 0.311839 \n", " 2021-08-03 00:00:00 2021-08-03 00:00:00 0.370461 \n", " 2021-08-03 12:00:00 2021-08-03 12:00:00 0.511861 \n", " 2021-08-04 00:00:00 2021-08-04 00:00:00 0.215171 \n", " 2021-08-04 12:00:00 2021-08-04 12:00:00 0.155513 \n", " 2021-08-05 00:00:00 2021-08-05 00:00:00 0.185829 \n", " 2021-08-05 12:00:00 2021-08-05 12:00:00 0.261576 \n", " 2021-08-06 00:00:00 2021-08-06 00:00:00 0.210918 \n", " 2021-08-06 12:00:00 2021-08-06 12:00:00 0.404165 \n", " 2021-08-07 00:00:00 2021-08-07 00:00:00 0.201150 \n", " 2021-08-07 12:00:00 2021-08-07 12:00:00 0.322430 \n", " 2021-08-08 00:00:00 2021-08-08 00:00:00 0.142454 \n", " 2021-08-08 12:00:00 2021-08-08 12:00:00 0.298323 " ] }, "execution_count": 37, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df" ] }, { "cell_type": "code", "execution_count": 38, "id": "ca9b190f", "metadata": { "execution": { "iopub.execute_input": "2024-09-12T13:27:03.299246Z", "iopub.status.busy": "2024-09-12T13:27:03.298521Z", "iopub.status.idle": 
"2024-09-12T13:27:03.309915Z", "shell.execute_reply": "2024-09-12T13:27:03.308608Z", "shell.execute_reply.started": "2024-09-12T13:27:03.299205Z" } }, "outputs": [], "source": [ "df.to_csv(f'{DATADIR}/2021-08_AOD_Paris.csv')" ] }, { "cell_type": "markdown", "id": "823be67e", "metadata": {}, "source": [ "### Please see the following tutorials on how to visualise this data in maps, plots and animations!" ] } ], "metadata": { "kaggle": { "accelerator": "none", "dataSources": [], "dockerImageVersionId": 30761, "isGpuEnabled": false, "isInternetEnabled": true, "language": "python", "sourceType": "notebook" }, "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.9" } }, "nbformat": 4, "nbformat_minor": 5 }