Utility of the Randolph Glacier Inventory (RGI) for regional and global glacier volume estimates

1.3.1. Utility of the Randolph Glacier Inventory (RGI) for regional and global glacier volume estimates#

Production date: 31-05-2025

Dataset version: 7.0

Produced by: Yoni Verhaegen and Philippe Huybrechts (Vrije Universiteit Brussel)

🌍 Use case: Using delineated glacier outlines and glacier extent data around the year 2000 for the estimation of the corresponding regional and global glacier ice volumes#

❓ Quality assessment question#

“What is the temporal distribution of the delineated glacier outlines, nominally provided for the year 2000, and how well can they be used to estimate regional and global glacier ice volumes?”

Glaciers are a major contributor to current global sea-level rise, a resource of fresh water, a potential threat of natural hazards, and an important factor for hydro-power production and runoff, as well as for recreation and tourism. A proper assessment of glacier areas, glacier characteristics, as well as their changes due to warming climatic conditions therefore plays a crucial role in dealing with these issues. In that regard, the “Glaciers distribution data from the Randolph Glacier Inventory (RGI) for year 2000” (here we use version 7.0) dataset on the Climate Data Store (CDS) provides key information with respect to glacier extent and their characteristics. The RGI dataset is a collection of digital glacier and ice cap outlines at the global scale, nominally provided for the year 2000 [1, 2]. The data are available in both vector (a shapefile with polygons of individual glacier outlines) and raster (as gridded data with the aggregated fractional glacier areas per pixel) format. Although the goal of the dataset is to provide glacier outlines of all glaciers on Earth as close as possible to the year 2000, one of the main known issues of the dataset is the fact that the acquisition date of the delineated glacier data varies substantially. This notebook investigates the corresponding temporal distribution of delineated glacier data in the vector version of the dataset and evaluates its implications for the estimation of current global glacier ice volumes using the example of Farinotti et al. (2019) [3].

📢 Quality assessment statements#

These are the key outcomes of this assessment

The RGIv7.0 dataset, available on the Climate Data Store, is the most comprehensive source of glacier outlines globally for around the year 2000. However, it has certain limitations, including variations in delineation dates (some glaciers were delineated before or after 2000) and the fact that the data represent a single snapshot in time, making it unsuitable for assessing temporal changes or climate impacts. These acquired glacier outlines that deviate from 2000 can lead to systematic errors in glacier volume estimates, with likely glacier volume and area underestimations for glaciers delineated after 2000 and overestimations for those delineated earlier.
Despite these limitations, the (area-weighted) mean acquisition date of the glacier outlines is relatively close to 2000 and the systematic error at the global scale is small (<1% of the total global ice volume estimate by Farinotti et al., 2019 [3]), making the dataset reliable to be used as input data for current global glacier volume estimates. However, regional and glacier-specific errors can vary significantly due to factors like strongly deviating delineation dates, varying glacier volume response times, and the presence of a supraglacial debris cover (which can further lead to errors). Users should carefully evaluate these factors for specific applications, including regional or glacier-specific volume estimates and other glaciological or hydrological analyses.

📋 Methodology#

Dataset description#

The dataset of the glaciers distribution on the Climate Data Store (CDS) is an almost complete collection of digital glacier and ice cap outlines and their geometrical/hypsometrical characteristics from various data sources at the global scale [1, 2]. The RGI dataset on the CDS is considered a snapshot of glacier outlines around the year 2000, assembled mainly from satellite images, and is currently the most complete dataset of glacier outlines. Simply stated, the glaciers in the dataset were automatically classified using the distinctive spectral reflectance signatures of bedrock and ice. During post-processing, raw glacier outlines are quality checked and manually corrected if required (e.g. in the case of a supraglacial debris cover). The vector part of the dataset divides the glaciers into separate RGI regions, i.e. there are 19 of those regions (“clusters” of glaciers) in RGIv7.0. The raster version of the data contains aggregated fractional glacier areas for each pixel of 1 by 1 degree but does not provide sufficient information to classify glaciers into distinct RGI regions. For a more detailed description of the data acquisition and processing methods, we refer to the documentation on the CDS and the ECMWF Confluence Wiki (Copernicus Knowledge Base).

Structure and (sub)sections#

1. Data preparation and processing

1.1 Import packages
1.2 Define request and download
1.3 Display and inspect data

2. The (area-weighted) mean year of delineation of glacier data

2.1 Determination of date of delineation for every glacier
2.2 Spatial distribution of date of delineation
2.3 The (area-weighted) mean year of delineation per RGI region

3. Estimation of current global glacier ice volumes with RGI data

3.1 Recent glacier area and mass/volume change trends
3.2 An attempt to quantify potential glacier volume over or underestimation
3.3 Other potential sources of error and uncertainty

4. Short summary and take-home messages

📈 Analysis and results#

1. Data preparation and processing#

1.1 Import packages#

First we load the packages:

1.2 Define request and download#

Then we define requests for download from the CDS:

🚨 The files can be large! Since the data files to be downloaded and manipulated have a considerable size, this may take a couple of minutes.
🚨 Insert the correct RGI version number to be downloaded below:

Next, we download the data:

Downloading and handling glacier extent data from the CDS, this may take a couple of minutes...

100%|██████████| 1/1 [00:00<00:00, 32.78it/s]

Downloading glacier mass change data...

100%|██████████| 1/1 [00:00<00:00, 65.71it/s]

Downloading done.

We define the functions to be used:

1.3 Display and inspect data#

Lastly, we can read and inspect the data. Let us print out the data to inspect its structure:

	rgi_id	o1region	o2region	glims_id	anlys_id	subm_id	src_date	cenlon	cenlat	utm_zone	...	zmin_m	zmax_m	zmed_m	zmean_m	slope_deg	aspect_deg	aspect_sec	dem_source	lmax_m	geometry
index
0	RGI2000-v7.0-G-01-00001	01	01-01	G204091E67414N	392889	624	2008-09-02T00:00:00	-155.909404	67.413726	5	...	1485.317600	1693.623500	1544.66940	1553.94030	17.606613	342.330469	1	COPDEM30	725	POLYGON Z ((-155.90493 67.4163 0, -155.90416 6...
1	RGI2000-v7.0-G-01-00002	01	01-01	G204121E67419N	392890	624	2008-09-02T00:00:00	-155.879114	67.419232	5	...	1278.400600	1394.765700	1328.55660	1330.91500	19.846369	345.367012	1	COPDEM30	484	POLYGON Z ((-155.87867 67.41748 0, -155.87872 ...
2	RGI2000-v7.0-G-01-00003	01	01-01	G204471E67431N	392897	624	2008-09-02T00:00:00	-155.530786	67.431484	5	...	1294.703400	1704.742000	1423.20680	1437.23520	23.690063	13.467490	1	COPDEM30	1099	POLYGON Z ((-155.53113 67.43564 0, -155.53079 ...
3	RGI2000-v7.0-G-01-00004	01	01-01	G204497E67431N	392899	624	2008-09-02T00:00:00	-155.501948	67.430514	5	...	1224.178800	1286.897700	1249.63060	1249.06620	12.782562	42.854332	2	COPDEM30	652	POLYGON Z ((-155.49973 67.43212 0, -155.49931 ...
4	RGI2000-v7.0-G-01-00005	01	01-01	G204521E67429N	392901	624	2008-09-02T00:00:00	-155.478173	67.432873	5	...	1052.440100	1488.129900	1290.89230	1273.78340	17.148510	299.058193	8	COPDEM30	1903	POLYGON Z ((-155.49636 67.43892 0, -155.49583 ...
...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...
274526	RGI2000-v7.0-G-19-02738	19	19-24	G356783E71154S	1009836	805	2000-02-06T00:00:00	-3.254147	-71.142262	30	...	51.334140	190.581200	119.21578	120.04070	1.384531	325.807697	8	COPDEM90	10105	POLYGON Z ((-2.9804 -71.23218 0, -2.99065 -71....
274527	RGI2000-v7.0-G-19-02739	19	19-24	G001187E70166S	1009760	805	2000-02-15T00:00:00	1.161199	-70.234860	31	...	-2.147076	266.489380	131.61113	135.94368	1.436012	342.096767	1	COPDEM90	24839	POLYGON Z ((1.0094 -70.05421 0, 1.01073 -70.05...
274528	RGI2000-v7.0-G-19-02740	19	19-24	G002050E70629S	1009761	805	2000-03-04T00:00:00	2.039158	-70.630907	31	...	4.757297	107.698586	65.82977	65.03951	1.445538	355.737166	1	COPDEM90	10029	POLYGON Z ((2.09959 -70.67463 0, 2.0919 -70.67...
274529	RGI2000-v7.0-G-19-02741	19	19-24	G002939E70509S	1009762	805	2000-03-04T00:00:00	2.929238	-70.505540	31	...	6.294697	261.422030	156.44052	153.55705	1.594081	310.365845	8	COPDEM90	19935	POLYGON Z ((3.09012 -70.38237 0, 3.09807 -70.3...
274530	RGI2000-v7.0-G-19-02742	19	19-24	G004329E70346S	1009763	805	2000-03-04T00:00:00	4.329046	-70.372802	31	...	-2.018840	352.334080	182.37610	185.75215	1.858386	302.110531	8	COPDEM90	15185	POLYGON Z ((4.34737 -70.5039 0, 4.33046 -70.50...

274531 rows × 29 columns

As can be seen above, the data includes attribute information for each individual glacier (i.e. delineated polygon) in the vector-type dataset. Important for this notebook are the BGNDATE and ENDDATE columns (for version 6.0) or src_date (for version 7.0), which contain information about the time of delineation of the specific glacier. If for some glaciers (part of) the attributes are missing, the data are filled by -9999999. We will use this information below.

2. The (area-weighted) mean year of delineation of glacier data#

2.1 Determination of date of delineation for every glacier#

We can begin to answer the user question by extracting information from the attribute table of the downloaded shapefile. For some glaciers the delineation of the outline is composed from several scenes over multiple years in v6.0, implying that a begin date (attribute “BGNDATE”) and an end date (attribute “ENDDATE”) of the outline delineation is given in the attribute table in YYYYMMDD format (if both are available). For v7.0, the acquisition or delineation date is simply given by “src_date” in YYYY-MM-DD format:

\( year_t = \textstyle\dfrac{BGNDATE_t + ENDDATE_t}{2}\ \) (version 6.0), or

\( year_t = srcdate_t \) (version 7.0)

This results in the following plot:

Show code cell source

Hide code cell source

# Convert dates from string to datetime, and add digitalization year
columns = []
if rgi_version == 6.0:
    for column in ("BGNDATE", "ENDDATE"):
        years = gdf[column].str[:4]
        months = gdf[column].str[4:6].replace("99", "01")
        days = gdf[column].str[6:8].replace("99", "01")
        date = years + months + days
        date = date.where(~date.str.startswith("-"))
        columns.append(pd.to_datetime(date))
elif rgi_version == 7.0:
    for column in ["src_date"]:
        years = gdf[column].str[:4]
        months = gdf[column].str[5:7].replace("99", "01")
        days = gdf[column].str[8:10].replace("99", "01")
        date = years + months + days
        date = date.where(~date.str.startswith("-"))
        columns.append(pd.to_datetime(date))
dates = pd.DataFrame(columns).mean()
year = dates.dt.year.astype("Int64")
gdf["decimal_year"] = year + (dates.dt.dayofyear - 1) / (365 + dates.dt.is_leap_year)
gdf["year"] = year

# Get data from years
size = gdf.set_index("year").groupby("year").size()
size = size.reindex(range(gdf["year"].min(), gdf["year"].max() + 1), fill_value=0)
missing = gdf["year"].isnull().sum()
missing_perc = 100 * (missing / len(gdf["year"]))
# Get some statistics
n_glaciers = len(gdf)
if rgi_version == 6.0:
    total_area = gdf["AREA"].sum()
elif rgi_version == 7.0:
    total_area = gdf["area_km2"].sum()
# Plot data
ax = size.plot.bar(
    figsize=(15, 5),
    grid=True,
    width=0.9,
    ylabel="Number of glaciers sampled",
    xlabel=r"Time of delineation $year_t$",
)
_ = ax.text(
    0.015,
    0.87,
    f"Date of delineation data (origin dates) are completely missing for {missing} glaciers or {missing_perc:.2f}% of the dataset.",
    transform=ax.transAxes,
    bbox={"facecolor": "white", "edgecolor": "black"},
);
_ = ax.text(
    0.015,
    0.935,
    f"A total number of {n_glaciers} glaciers is present in the dataset, covering a total area of {total_area:.2f} km².",
    transform=ax.transAxes,
    bbox={"facecolor": "white", "edgecolor": "black"},
);
ax.grid(color='#95a5a6',linestyle='-',alpha=0.25)
ax.set_title("Frequency of the year of delineation in the RGIv{} dataset".format(rgi_version), fontsize=18);

../../_images/b268b3da643a0e155d0a0de2f3cb4c0da879f608c8956dec385c26742dc3d1ec.png

Figure 1. Time of delineation (date of acquisition) for the glaciers in the glacier extent dataset.

2.2 Spatial distribution of date of delineation#

Now that we calculated the acquisition dates, let us check the spatial distribution of the year of delineation for each glacier in the dataset. We therefore produce a world map where every dot represents a glacier, that is colored according to its year of delineation. Glaciers with missing data are not plotted:

../../_images/9947215261b88fc41556b1b651c053728dee8986e2b0617a416046921087ff07.png

Figure 2. Spatial distribution of the time of delineation (date of acquisition) for the glaciers in the glacier extent dataset.

From the above plots, it becomes clear that not all glaciers are delineated in the year 2000. Although the dataset is intended to be a snapshot of the world’s glaciers as they were near the beginning of the 21\(^{st}\) century, the user must keep in mind that the range of delineation dates is substantial. This means that the outlines do not all lie in between the same time frame, which is a main issue of the dataset. The figures above, however, demonstrate that most outlines originate from the period between 2000 and 2010. Nevertheless, for some areas like the western United States or some peripheral glaciers around northern Greenland and the Antarctic Penninsula, the glacier outlines date back to the 1970s or earlier.

2.3 The (area-weighted) mean year of delineation per RGI region#

Let us now calculate the (area-weighted) mean time of delineation for the 19 different RGI regions (see Figure 5 for the locations of these regions). We do that as follows:

Arithmetic mean: \( \overline{year}_a = \textstyle\dfrac{1}{{n\in \text{region}}}{\sum\limits_{i=1}^{n\in \text{region}} year_{t,i}} \) with \( n \) the total number of glaciers and \(year_t\) the time of delineation for glacier \(i\) as calculated above.
Area-weighted mean: \( \overline{year}_w = \left(\dfrac{1}{\sum\limits_{i=1}^{n\in \text{region}} A_i}\right) \sum\limits_{i=1}^{n\in \text{region}} \left(A_i * year_{t,i}\right) \) where \( A_i \) is the glacier area for glacier \(i\) [km²].

This results in the following plot:

Show code cell source

Hide code cell source

if rgi_version == 6.0:
    # Extract region number from RGIID
    gdf["region"] = gdf["RGIID"].str[6:8].astype(int)

    # Group by region and calculate total area
    grouped = gdf[["year", "decimal_year", "AREA", "region"]].groupby("region")
    ungrouped = gdf[["year", "decimal_year", "AREA", "region"]]
    total_area = grouped["AREA"].sum()

    # Calculate means
    means = {
        "Arithmetic": grouped["decimal_year"].mean(),
        "Area-weighted": grouped.apply(
            weighted_average, "decimal_year", "AREA", include_groups=False
        ),
    }
elif rgi_version == 7.0:
    # Extract region number from o1region
    gdf["region"] = gdf["o1region"].astype(int)

    # Group by region and calculate total area
    grouped = gdf[["year", "decimal_year", "area_km2", "region"]].groupby("region")
    ungrouped = gdf[["year", "decimal_year", "area_km2", "region"]]
    total_area = grouped["area_km2"].sum()

    # Calculate means
    means = {
        "Arithmetic": grouped["decimal_year"].mean(),
        "Area-weighted": grouped.apply(
            weighted_average, "decimal_year", "area_km2", include_groups=False
        ),
    }

# Ensure regions are correctly defined as an array
regions = np.array(list(grouped.groups.keys()))  # Extract regions as a sorted array

# Extract the means
arithmetic_means = means["Arithmetic"]
area_weighted_means = means["Area-weighted"]

# Define bar width and positions
bar_width = 0.4
r1 = regions - bar_width / 2  # Positions for arithmetic bars
r2 = regions + bar_width / 2  # Positions for area-weighted bars

textstr = '\n'.join((
    'The global arithmetic mean year of delineation (RGI v{:.1f}) is {:.1f}.'.format(rgi_version, gdf["decimal_year"].mean()),
    'The global area-weighted mean year of delineation (RGI v{:.1f}) is {:.1f}.'.format(rgi_version, means["Area-weighted"].mean()),
))
print(textstr)

# Create the bar plot
plt.figure(figsize=(12, 6))
plt.bar(r1, arithmetic_means, width=bar_width, color='blue', label='Arithmetic')
plt.bar(r2, area_weighted_means, width=bar_width, color='red', label='Area-weighted')

# Add labels, title, and legend
plt.xlabel("RGI region number")
plt.ylabel(r"Year of delineation $\overline{year}_a$ or $\overline{year}_w$")
plt.title("The (area-weighted) mean year of delineation per RGI region (RGI v{})".format(rgi_version), fontsize=18)
plt.xticks(regions)  # Ensure x-ticks match region numbers
plt.ylim(1975, 2015)  # Replace 1980 and 2020 with your desired range
plt.legend()

# Add grid and show plot
plt.grid(color='#95a5a6', linestyle='-', alpha=0.25, axis='y')  # Grid only on y-axis
plt.tight_layout()
plt.show()

The global arithmetic mean year of delineation (RGI v7.0) is 2001.8.
The global area-weighted mean year of delineation (RGI v7.0) is 2001.7.

../../_images/361ab8da4ac76c3e6217e31ce3752aab5522482225b61605b2f4331144b9befa.png

Figure 3. Arithmetic and area-weighted mean year of delineation for the glaciers in each RGI region of the glacier extent dataset.

We observe that at the global scale, the arithmetic area-weighted average of the year of delineation is close to 2002, which only slightly differs in time from the nominally stated dataset reference time (i.e. 2000). However, the (area-weighted) average of the time of delineation varies notably across the different RGI regions: for RGI region 19, for example, the area-weighted average dates back to the beginning of the 1990s, but for RGI region 1 the average origin dates are closer to the 2010s.

3. Estimation of current global glacier ice volumes with RGI data#

3.1 Recent glacier area and mass/volume change trends#

Going back to our specific use case, it is worth noting that several methods exist in the literature to derive ice thickness and ice volume estimates of glaciers at different spatial scales, of which the most well-known procedures are the volume-area scaling method, numerical modelling, and the estimation of ice thicknesses from surface characteristics and the principle of ice dynamics. These methods, however, strongly rely on the precise determination of the glacier surface area (and hypsometry) as input data to derive total ice volume estimates.

Farinotti et al. (2019), for example, used the RGIv6.0 dataset to derive a global glacier ice volume of 158.17 ± 41.03 × 10³ km³, noting the heavy reliance on accurate glacier surface area data [3]. Other studies, including Huss and Farinotti (2012) and Marzeion et al. (2012), have similarly highlighted input data uncertainties with respect to the glacier area in ice volume estimates [4, 5]. The main concern here is that the glacierized area has decreased significantly during the last several decades, with global glacier area decreasing at rates of -0.18% to -0.34% per year and median retreat rates of 7.4 m per year [6, 7, 8]. These changes suggest a potential overestimation of volumes for glaciers delineated before 2000 and an underestimation elsewhere. The suitability of RGI outlines for current volume estimates therefore primarily depends on (1) the deviation of the time since delineation from 2000 and (2) the significance of glacier area/volume changes during that period, which are influenced by the local climate and specific glacier geometries.

3.2 An attempt to quantify potential glacier volume over or underestimation#

It is difficult to quantify the potential volume over or underestimation due to deviating outlines for each glacier in the dataset. However, we can get an idea of the regional glacier volume under/overestimation for 2000 CE by using other datasets that are available on the CDS. Our goal here is to obtain values for the regional glacier volume change between the year 2000 and the area-weighted mean year of delineation of that region \(\overline{year}_w\) (which is thus equivalent to the potential volume over or underestimation relative to the year 2000), denoted with the symbol \(dV_{\text{region}}^{2000,\overline{\text{year}}_w}\).

To get an idea of these potential glacier volume over- or underestimations, we can make use of the “Glacier mass change gridded data from 1976 to present derived from the Fluctuations of Glaciers Database” dataset that is on the CDS. This dataset estimates the gridded yearly mass change of glacier ice (\(dM/dt\)) in Gt yr\(^{-1}\), which can be converted to glacier volume changes (\(dV/dt\)) with units of km\(^3\) yr\(^{-1}\) of ice by making use of the appropriate density. We thus calculate the mass and volume change between 2000 and the area-weighted average year of delineation for a certain RGI region (\(\overline{year}_w\)) to get a total volume change for that specific RGI region between those two dates \(dV_{\text{region}}^{2000,\overline{year}_w}\). If \({{\overline{year}_w}}\) is a decimal (fractional) year, we use linear interpolation to find the interpolated glacier mass and volume change:

\( dV_{\text{region}}^{2000,\overline{year}_w} \text{ [km}^3\text{]} = \begin{cases} \left(\sum\limits_{i=2000}^{\lceil \overline{year}_w \rceil} \sum\limits^{\substack{\text{x,y} \\ \text{{region}}}} \left(\frac{1000}{\rho_i} \frac{dM}{dt}\right)\right) - \\ \left[ \left( \lceil \overline{year}_w \rceil - \overline{year}_w \right) * \left( \sum\limits^{\substack{\text{x,y} \\ \text{region}}} \left(\frac{1000}{\rho_i} \frac{dM}{dt}\right)_{\lceil \overline{year}_w \rceil} \right) \right] & \text{if } \overline{year}_w > 2000 \\ \left(\sum\limits_{i=\lfloor \overline{year}_w \rfloor}^{2000} \sum\limits^{\substack{\text{x,y} \\ \text{region}}} \left(\frac{1000}{\rho_i} \frac{dM}{dt}\right)\right) - \\ \left[ \left( \overline{year}_w - \lfloor \overline{year}_w \rfloor \right) * \left( \sum\limits^{\substack{\text{x,y} \\ \text{region}}} \left(\frac{1000}{\rho_i} \frac{dM}{dt}\right)_{\lfloor \overline{year}_w \rfloor} \right) \right] & \text{if } \overline{year}_w < 2000 \end{cases} \)

Here, we further assume a density \(\rho_i\) of 850 kg m\(^{-3}\) for the mass to volume conservation of glaciers [9]. This may look like a complicated formula, but the principle is simple: for each RGI region, we simply sum the mass (and hence volume) changes over all spatial pixels between the year 2000 and the area-weighted year of delineation \((\overline{year}_w)\):

If \((\overline{year}_w)\) is an integer, the second part of the formula (after the minus sign) drops out.
If \((\overline{year}_w)\) is not an integer, the formula adjusts for the fractional part of the year:
- If \((\overline{year}_w)\) > 2000, it uses the fraction between \((\overline{year}_w)\) and \((\lceil \overline{year}_w \rceil)\), where \(\lceil \cdot \rceil\) denotes the ceiling function, which rounds a number up to the nearest integer.
- If \((\overline{year}_w)\) < 2000, it uses the fraction between \((\lfloor \overline{year}_w \rfloor)\) and \((\overline{year}_w)\), where \(\lfloor \cdot \rfloor\) denotes the floor function, which rounds a number down to the nearest integer.

Now that we have all information to derive values for \( dV_{\text{region}}^{2000,\overline{year}_w}\), let us apply this to the 19 different RGI regions in the RGIv7.0 data:

Show code cell source

Hide code cell source

if rgi_version == 6.0:
    # Mask data for RGI regions
    regions = gdf["RGIID"].str[6:8].astype(int)
    da = regions.to_xarray().assign_coords(
        lon=gdf["CENLON"].to_xarray(),
        lat=gdf["CENLAT"].to_xarray(),
    )
elif rgi_version == 7.0:
    # Mask data for RGI regions
    regions = gdf["o1region"].astype(int)
    da = regions.to_xarray().assign_coords(
        lon=gdf["cenlon"].to_xarray(),
        lat=gdf["cenlat"].to_xarray(),
    )
regridder = xe.Regridder(da, ds, locstream_in=True, method="nearest_s2d")
mask_2d = regridder(da)
mask = xr.concat(
    [(mask_2d == region).expand_dims(region=[region]) for region in regions.unique()],
    "region",
)
ds = ds.where(mask)

# Compute cumulative fields
ds["time"].attrs |= {"long_name": "Time", "units": "yr"}
for da in ds.data_vars.values():
    da.attrs["units"] += " $yr^{-1}$"
    da.attrs["long_name"] = da.attrs["long_name"].replace("_", " ").title()

# Mass change
cumulative = (ds["glacier_mass_change_gt"].sum(("latitude","longitude"),keep_attrs=True)).cumsum("time")
cumulative.attrs = {
    "units": ds["glacier_mass_change_gt"].attrs["units"].split()[0],
    "long_name": f"Cumulative {ds['glacier_mass_change_gt'].attrs['long_name']}",
}
ds["Cumulative"] = cumulative
    
# Calculate volume under or overestimation
year = 2000
original = ds.reset_coords()["Cumulative"]
time = original["time"]
original["time"] = time.dt.year + (time.dt.dayofyear - 1) / (365 + time.dt.is_leap_year)
interpolated = original.interp(time=means["Area-weighted"].to_xarray())
estimate = (1000/850) * (interpolated - original.sel(time=year)).compute() # using appropriate density from Huss (2013)
# Do the loop
for region, result in zip(estimate.region.values, estimate.values):
    print(
        f"The volume estimate in RGI region {region:>2}"
        f" for the year {year} is"
        f" {'under' if result < 0 else 'over':>5}estimated"
        f" by {abs(float(result)):.2f} km³."
    )
total = estimate.sum()
print(
    f"\nThe volume estimate at the global scale"
    f" for the year {year} is"
    f" {'under' if result <0 else 'over'}estimated"
    f" by {abs(float(total)):.2f} km³."
)

The volume estimate in RGI region  1 for the year 2000 is underestimated by 624.97 km³.
The volume estimate in RGI region  2 for the year 2000 is underestimated by 39.12 km³.
The volume estimate in RGI region  3 for the year 2000 is  overestimated by 5.98 km³.
The volume estimate in RGI region  4 for the year 2000 is underestimated by 23.77 km³.
The volume estimate in RGI region  5 for the year 2000 is underestimated by 21.75 km³.
The volume estimate in RGI region  6 for the year 2000 is underestimated by 6.82 km³.
The volume estimate in RGI region  7 for the year 2000 is underestimated by 101.60 km³.
The volume estimate in RGI region  8 for the year 2000 is underestimated by 13.40 km³.
The volume estimate in RGI region  9 for the year 2000 is underestimated by 86.71 km³.
The volume estimate in RGI region 10 for the year 2000 is underestimated by 4.07 km³.
The volume estimate in RGI region 11 for the year 2000 is underestimated by 5.97 km³.
The volume estimate in RGI region 12 for the year 2000 is underestimated by 0.45 km³.
The volume estimate in RGI region 13 for the year 2000 is underestimated by 10.02 km³.
The volume estimate in RGI region 14 for the year 2000 is underestimated by 3.92 km³.
The volume estimate in RGI region 15 for the year 2000 is underestimated by 10.98 km³.
The volume estimate in RGI region 16 for the year 2000 is underestimated by 0.13 km³.
The volume estimate in RGI region 17 for the year 2000 is underestimated by 41.87 km³.
The volume estimate in RGI region 18 for the year 2000 is  overestimated by 0.26 km³.
The volume estimate in RGI region 19 for the year 2000 is underestimated by 185.05 km³.

The volume estimate at the global scale for the year 2000 is underestimated by 1174.35 km³.

Or when plotted:

../../_images/511d4a65a5a716b76fce5015da309987977a9f10355e3a6017a2974bb7a32d6b.png

Figure 4. Total volume over or underestimation compared to the year 2000 for the glaciers in each RGI region of the glacier extent dataset.

In the plot above, negative values indicate an underestimation. Underestimations of glacier volumes are dominant, which is because the (area-weighted) average origin dates of the outlines are generally later than 2000, together with a general trend of glacier shrinkage during the last several decades [6]. It is clear that especially RGI regions 1 (Alaska) and 19 (Antarctic and Subantarctic) exhibit a relatively large underestimation of the total glacier ice volume, mainly due to their large deviation of the area-weighted average year of delineation to the year 2000, as well as the large glacier ice volume changes during this corresponding period. At the global scale, glacier ice volume estimates would exhibit an underestimation of ca. 1174 km³ due to misdated glacier outlines, which is only ca. 0.75% of the total glacier ice volume estimate by Farinotti et al. (2019) [3]. At the global scale, the above obtained deviation related to the misdated glacier outlines from RGIv7.0 is thus relatively low when compared to the total global glacier ice volume estimate from Farinotti et al. (2019), implying that the dataset is well-suited to estimate current global glacier ice volumes for the year 2000. The corresponding systematic error is thus small, but glacier- and region-dependent.

alternatvie text

Figure 5. The 19 first-order regions of the RGI version 7.0 and glacier locations in red. From: GLIMS RGI.

3.3 Other potential sources of error and uncertainty#

Several additional factors contribute to the uncertainty in glacier data derived from the RGI dataset. These include low-resolution and poor-quality source data, errors by delineators, and challenges such as shadows, clouds, proglacial lakes, seasonal snowfields, and limited multi-temporal satellite imagery. The most significant source of misdelineation is supraglacial debris, which often results in glacier area underestimation in regions with extensive debris cover (e.g. High Mountain Asia and the Caucasus), due to the spectral similarity between debris and surrounding moraines or bedrock [10]. Additionally, ice bodies smaller than 0.01 km², considered below the threshold for ice flow, were excluded, leading to further underestimation of glacier area in some regions [1].

The interpretation of what constitutes a glacier also varies due to the diverse global community that contributed to the dataset, causing inconsistencies in how features like debris-covered glaciers, tributaries, and peripheral ice bodies in Antarctica and Greenland were handled [11]. For example, distinguishing debris-covered ice from bedrock or separating outlet glaciers from main ice sheets remains problematic. However, in the more recent v7.0, only glaciers that are not connected or only weakly connected to the GrIS (as defined by connectivity levels lower than 2 in [13]) are included in the dataset. For the AIS, the seperation remains, in some cases, open to discussion. In the southern Asian/Himalayan region, significant differences between glacier areas in the gridded and vector versions of the dataset are furthermore present [12].

Quantitative error estimates are not provided in the dataset, but [2] noted that relative errors are higher for small glaciers due to their larger outline-to-area ratio. Consequently, the uncertainty in an individual glacier outline depends on the quality of the source material and the specific glacier’s characteristics, even for outlines assumed to represent 2000. Users should exercise caution, particularly in regions with significant debris cover or small glacier features.

4. Short summary and take-home messages#

The glacier area dataset on the CDS aims to capture the spatial distribution and state of glacier outlines representative for the year 2000. However, the date of delineation (i.e. the time for which the outlines are representative) can significantly deviate from 2000, resulting in a systematic error. Nevertheless, the (area-weighted) mean acquisition date of the glacier outlines is relatively close to 2000 at the global scale, although a tendency for delineation dates that are slightly more recent than 2000 is noted.

When quantifying current global glacier volumes with RGI glacier areas as input data, this sytematic error has to be taken into account. The suitability of a RGI outline for current glacier ice volume estimates hence depends on (1) the deviation of the time since delineation from 2000 and (2) the significance of glacier area/volume changes during that period, which are influenced by changes of the local climate and the specific glacier geometries. When compared to the total global ice volume estimate by Farinotti et al. (2019) [3], the resulting systematic error calculated in this notebook is in the order of <1% only. This makes the dataset reliable to be used as input data for current global glacier volume estimates, but the systematic error is glacier- and region-dependent.

Additional factors to take into account are, amongst others, supraglacial debris, which often leads to underestimation of glacier area in debris-covered regions such as High Mountain Asia. Furthermore, although no quantitative error estimates are provided with the data, small glaciers tend to have higher relative errors [2]. When fed into glaciological and/or hydrological models, users are thus advized to take into account these considerations.

ℹ️ If you want to know more#

Key resources#

Glaciers distribution data from the Randolph Glacier Inventory (RGI) for year 2000
Documentation on the CDS and the ECMWF Confluence Wiki
Copernicus climate change indicators: glaciers
RGI website from GLIMS (Global Land Ice Measurements from Space)
C3S EQC custom functions, c3s_eqc_automatic_quality_control prepared by B-Open.

References#

[1] RGI Consortium (2017). Randolph Glacier Inventory – A Dataset of Global Glacier Outlines: Version 6.0: Technical Report, Global Land Ice Measurements from Space, Colorado, USA. Digital Media. doi: 10.7265/N5-RGI-60.
[2] Pfeffer, W. T., Arendt, A. A., Bliss, A., Bolch, T., Cogley, J. G., Gardner, A. S., Hagen, J. O., Hock, R., Kaser, G., Kienholz, C., Miles, E. S., Moholdt, G., Mölg, N., Paul, F., Radić, V., Rastner, P., Raup, B. H., Rich, J., Sharp, M. J., and Glasser, N. (2014). The Randolph Glacier Inventory: A globally complete inventory of glaciers, Journal of Glaciology, 60(221), 537-552. doi: 10.3189/2014JoG13J176.
[3] Farinotti, D., Huss, M., Fürst, J. J., Landmann, J., Machguth, H., Maussion, F., and Pandit, A. (2019). A consensus estimate for the ice thickness distribution of all glaciers on Earth. Nature Geoscience, 12(3), 168-173. doi: 10.1038/s41561-019-0300-3.
[4] Huss, M., and Farinotti, D. (2012). Distributed ice thickness and volume of all glaciers around the globe, Journal of Geophysical Research, 117, F04010. doi: 10.1029/2012JF002523.
[5] Marzeion, B., Jarosch, A. H., and Hofer, M. (2012). Past and future sea-level change from the surface mass balance of glaciers, The Cryosphere, 6, 1295–1322. doi: 10.5194/tc-6-1295-2012.
[6] Zemp, M., Frey, H., Gärtnew-Roer, I., Nussbaumer, S. U., Helzle, M., Paul, F., Haeberli, W., Denzinger, F., Ahlstrøm, A. P., Anderson, B., Bajracharya, S., Baroni, C., Braun, L. N., Cáceres, B. E., Casassa, G., Cobos, G., Dávila, L. R., Delgado Granados, H., Demuth, M. N., Espizua, L., Fischer, A., Fujita, K., Gadek, B., Ghazanfar, A., Hagen, J. O., Holmlund, P., Karimi, N., Li, Z., Pelto, M., Pitte, P., Popovnin, V. V., Portocarrero, C. A., Prinz, R., Sangewar, C. V., Severskiy, I., Sigurdsson, O., Soruco, A., Usubaliev, R., and Vincent, C. (2015). Historically unprecedented global glacier decline in the early 21st century, Journal of Glaciology, 61, 745-762. doi: 10.3189/2015JoG15J017.
[7] Li, Y. J., Ding, Y. J., Shangguan, D. H., and Wang, R. J. (2019). Regional differences in global glacier retreat from 1980 to 2015, Advances in Climate Change Research, 10(4), 203–213. doi: 10.1016/j.accre.2020.03.003.
[8] Leclercq, P. W., Oerlemans, J., Basagic, H. J., Bushueva, I., Cook, A. J., and Le Bris, R. (2014). A data set of worldwide glacier length fluctuations, Cryosphere, 2014(8), 659–672. doi: 10.5194/tc-8-659-2014.
[9] Huss, M. (2013). Density assumptions for converting geodetic glacier volume change to mass change, The Cryosphere, 7, 877–887, doi: 10.5194/tc-7-877-2013.
[10] Paul, F., Barrand, N. E., Baumann, S., Berthier, E., Bolch, T., Casey, K., Frey, H., Joshi, S. P., Konovalov, V., Le Bris, R., Mölg, N., Nosenko, G., Nuth, C., Pope, A., Racoviteanu, A., Rastner, P., Raup, B., Scharrer, K., Steffen, S., and Winsvold, S. (2013). On the Accuracy of Glacier Outlines Derived from Remote-Sensing Data. Annals of Glaciology, 54(63), 171–82. doi: 10.3189/2013AoG63A296.
[11] Hock, R., Maussion, F., Marzeion, B., and Nowicki, S. (2022). What is the global glacier ice volume outside the ice sheets? Journal of Glaciology, 69(273), 204–10. doi: 10.1017/jog.2023.1.
[12] Li, Y. J., Li, F., Shangguan, D. H., Ding, Y. J. (2021). A New Global Gridded Glacier Dataset Based on the Randolph Glacier Inventory Version 6.0. Journal of Glaciology, 67 (2021), 773–76. doi: 10.1017/jog.2021.28.
[13] Rastner, P., Bolch, T., Mölg, N., Machguth, H., Le Bris, R., and Paul, F. (2012). The first complete inventory of the local glaciers and ice caps on Greenland, The Cryosphere, 6, 1483–1495, https://doi.org/10.5194/tc-6-1483-2012

Utility of the Randolph Glacier Inventory (RGI) for regional and global glacier volume estimates

Contents

1.3.1. Utility of the Randolph Glacier Inventory (RGI) for regional and global glacier volume estimates#

🌍 Use case: Using delineated glacier outlines and glacier extent data around the year 2000 for the estimation of the corresponding regional and global glacier ice volumes#

❓ Quality assessment question#

📢 Quality assessment statements#

📋 Methodology#

Dataset description#

Structure and (sub)sections#

📈 Analysis and results#

1. Data preparation and processing#

1.1 Import packages#

1.2 Define request and download#

1.3 Display and inspect data#

2. The (area-weighted) mean year of delineation of glacier data#

2.1 Determination of date of delineation for every glacier#

2.2 Spatial distribution of date of delineation#

2.3 The (area-weighted) mean year of delineation per RGI region#

3. Estimation of current global glacier ice volumes with RGI data#

3.1 Recent glacier area and mass/volume change trends#

3.2 An attempt to quantify potential glacier volume over or underestimation#

3.3 Other potential sources of error and uncertainty#

4. Short summary and take-home messages#

ℹ️ If you want to know more#

Key resources#

References#