Quality assessment of glacier extent data from satellite observations: bias, reliability, and accuracy for glaciological and hydrological applications

1.3.2. Quality assessment of glacier extent data from satellite observations: bias, reliability, and accuracy for glaciological and hydrological applications#

Data stream: satellite (observations)
Quality area: bias, reliability, accuracy (uncertainty)
Application area: glaciological, climatological and/or hydrlogical applications, monitoring and models

Production date: 17-07-2024

Produced by: Yoni Verhaegen and Philippe Huybrechts (Vrije Universiteit Brussel)

🌍 Use case: Using digitized glacier outlines and glacier extent data around the year 2000 CE for the estimation of the corresponding global glacier ice volumes and their sea level equivalent#

❓ Quality assessment question#

“What is the temporal distribution of the digitized glacier outlines, nominally provided for the year 2000 CE, and how well can they be used to estimate corresponding global glacier ice volumes?”

Glaciers are a major contributor to current global sea-level rise, a resource of fresh water, a potential threat of natural hazards, and an important factor for hydropower production and runoff, as well as for recreation and tourism. A proper assessment of glacier areas, glacier characteristics, as well as their changes due to warming climatic conditions therefore plays a crucial role in dealing with these issues. In that regard, the ‘Glaciers distribution data from the Randolph Glacier Inventory (RGI) for year 2000’ dataset provides key information with respect to glacier extent and their characteristics. The RGI dataset is a collection of digital glacier and ice cap outlines at the global scale, nominally provided for the year 2000 CE (RGI Consortium, 2017). The data are available in both vector (a shapefile with polygons of individual glacier outlines) and raster (as gridded data with the aggregated fractional glacier areas per pixel) format. As it is the most complete dataset of glacier outlines with a global coverage, researchers often take this data to represent the current state of all glaciers and it has therefore been used accordingly for an impressive amount of (modelling) studies as a reference dataset. However, although the goal of the dataset is to provide glacier outlines of all glaciers on Earth as close as possible to the year 2000, one of the main known issues of the dataset is the fact that the date of origin of the digitized glacier data varies substantially. This notebook investigates the corresponding temporal distribution of digitized glacier data in the vector version of the dataset and evaluates its implications for the estimation of current global glacier ice volumes using the example of Farinotti et al. (2019).

📢 Quality assessment statement#

The glaciers distribution (‘Randolph Glacier Inventory’ or RGI) dataset on the Climate Data Store (CDS) (here version v6.0 is used) is currently the most complete dataset of glacier outlines in terms of its spatial coverage and is generally considered the main reference dataset to determine the glaciers distribution across the globe around the year 2000 CE. However, the data as available on the CDS have very minor applications as a stand-alone dataset and some limitations have to be taken into account when using the data in combination with additional (external) datasets.
One of the typical problems with the data is the fact that not all digitized glacier information exactly corresponds to data in the year 2000 CE, as is nevertheless stated in the title of the dataset. Although most glaciers are digitized based on data sourcing from the timeframe between 2000 and 2010 CE, certain regions exhibit data that date back further in time, even until the 1940s. For a small percentage of glaciers, dates of digitization are furthermore missing. Other limitations that potentially also affect the suitability of the glacier extent dataset are that the data cannot be used for temporal change or climate change impact assessment (it is a single snapshot in time), and that some data are still subject to poor-quality outlines (e.g. for debris-covered glaciers).
The long aggregation period of digitized glacier data should be kept in mind when determining glacier ice volume estimates based on RGI outlines (e.g. as done by Farinotti et al., 2019). This systematic error in the outlines namely has implications for glacier ice volume estimation: due to the observed worldwide shrinkage of the glacier area over the last several decades (e.g. Li et al., 2019; Zemp et al., 2019), a very likely underestimation of the glacier area and ice volume estimate for 2000 CE occurs for glaciers digitized after 2000 CE, while else a very likely overestimation occurs.
However, an estimate of the corresponding systematic error (at the global scale) related to misdated glacier outlines in the RGIv6.0 obtained in this notebook is found to only comprise a small percentage (<1%) of the total glacier ice volume estimate from Farinotti et al. (2019), implying that the dataset is well-suited to be used for glacier ice volume estimates at the global scale for the year 2000 CE.
The under/overestimation effect is, however, highly glacier (and region) specific, as it depends on e.g. the glacier-specific time of digitization, the glacier volume response time (which is related to the local climatic regime and topographic/geometric glacier features), its debris-covered area (causing errors to the glacier outline), etc. Users should therefore revise case-specific details before computing glacier-specific or region/global-scale volume estimates with RGI outlines. This info can also be important for similar glaciological, hydrological, and climatological applications.

📋 Methodology#

Short description#

The dataset of the glaciers distribution on the Climate Data Store (CDS) is an almost complete collection of digital glacier and ice cap outlines and their geometrical/hypsometrical characteristics from various data sources at the global scale (RGI Consortium, 2017). The dataset on the CDS is considered a snapshot of glacier outlines around the year 2000 CE, assembled mainly from satellite images, and is based on the Randolph Glacier Inventory (RGI), which is currently the most complete dataset of glacier outlines. Simply stated, the glaciers in the dataset were automatically classified using the distinctive spectral reflectance signatures of snow and ice. During post-processing, raw glacier outlines are quality checked and manually corrected if required (e.g. in the case of a supraglacial debris cover). It also divides the glaciers into separate RGI regions, i.e. there are 19 of those regions (“clusters” of glaciers) in RGIv6.0. For a more detailed description of the data acquisition and processing methods, we refer to the documentation on the CDS or the ECMWF Confluence Wiki. The resulting glacier extent data can be downloaded in vector topology (used here) or as raster-based data.

Structure and (sub)sections#

In this notebook, the applicability of RGIv6.0 data to serve as an input for current global glacier ice volumes will be assessed by analyzing the temporal distribution of the dates of origin of the digitized glacier data, discussing the potential limitations and error sources of the dataset, and evaluating the implications for estimating current global glacier ice volumes based upon the example of Farinotti et al. (2019). Potential volume over/underestimation due to misdated glacier outlines will furthermore be quantified for each RGI region separately with the global glacier mass change data that are on the CDS. The structure is as follows:

Data preparation and processing: this section loads packages, defines requests for download from the CDS, downloads the actual data and inspects the data to reveal its structure. Also the functions that are used in this notebook are defined in this section.
The (area-weighted) mean year of digitization of glacier data: this section assesses how close the data of each of the glaciers is situated to the year 2000 CE. The spatial distribution of the year of digitization is therefore analysed and plotted for each glacier. The arithmetic and area-weighted mean year of digitization are calculated for all RGI regions.
Other potential sources of error and uncertainty: this section discusses additional limitations of the dataset, apart from the long aggregation period discussed earlier, that can be of importance for this specific use case.
An attempt to quantify potential glacier volume and area over or underestimation due to misdated outlines: this section estimates the potential over/underestimation of glacier surface areas and glacier ice volumes due to misdated outlines for each RGI region. Hence, based on observed volume changes between 2000 CE and the area-weighted mean year of digitization, we project glacier areas back to the year 2000 CE using area-volume scaling. We therefore additionally make use of the “glacier mass change” dataset on the CDS. This section thus discusses and quantifies the suitability of the glacier extent data for the specific use case and question.

📈 Analysis and results#

⏬ Data preparation and processing#

First we load the packages:

Then we define requests for download from the CDS:

Next, we download the data.

🚨 The files can be large! Since the data files to be downloaded and manipulated have a considerable size, this may therefore take a couple of minutes.

Downloading and handling glacier extent data from the CDS, this may take a couple of minutes...

100%|██████████| 1/1 [00:04<00:00,  4.64s/it]

Downloading glacier mass change data from the CDS...

100%|██████████| 1/1 [00:00<00:00,  1.02it/s]

Download completed.

We define the functions to be used:

Lastly, we can read and inspect the data. Let us print out the data to inspect its structure:

	C3S_ID	RGIID	GLIMSID	BGNDATE	ENDDATE	CENLON	CENLAT	O1REGION	O2REGION	AREA	ZMIN	ZMAX	ZMED	SLOPE	ASPECT	LMAX	NAME	geometry
index
0	C3S_000001	RGI60-01.00001	G213177E63689N	20090703	-9999999	-146.8230	63.6890	1	2	0.360	1936	2725	2385	42.0	346	839	None	POLYGON ((-146.81803943799997 63.6908060250000...
1	C3S_000002	RGI60-01.00002	G213332E63404N	20090703	-9999999	-146.6680	63.4040	1	2	0.558	1713	2144	2005	16.0	162	1197	None	POLYGON ((-146.66353681599998 63.4076384990000...
2	C3S_000003	RGI60-01.00003	G213920E63376N	20090703	-9999999	-146.0800	63.3760	1	2	1.685	1609	2182	1868	18.0	175	2106	None	POLYGON ((-146.07231695599998 63.3834750890000...
3	C3S_000004	RGI60-01.00004	G213880E63381N	20090703	-9999999	-146.1200	63.3810	1	2	3.681	1273	2317	1944	19.0	195	4175	None	POLYGON ((-146.14895308999996 63.3791882220000...
4	C3S_000005	RGI60-01.00005	G212943E63551N	20090703	-9999999	-147.0570	63.5510	1	2	2.573	1494	2317	1914	16.0	181	2981	None	POLYGON ((-147.04306686499996 63.5502401350000...
...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...
216424	C3S_216425	RGI60-19.02748	G322268E53986S	20020502	-9999999	-37.7325	-53.9860	19	3	0.042	310	510	-999	29.9	315	255	None	POLYGON ((-37.73275171199998 -53.9877898439999...
216425	C3S_216426	RGI60-19.02749	G323864E54831S	20030207	-9999999	-36.1361	-54.8310	19	3	0.567	330	830	-999	23.6	200	1130	None	POLYGON ((-36.138344003999975 -54.827345910999...
216426	C3S_216427	RGI60-19.02750	G322698E54188S	20030207	-9999999	-37.3018	-54.1884	19	3	4.118	10	1110	-999	16.8	308	4329	None	POLYGON ((-37.29308757999996 -54.1750637379999...
216427	C3S_216428	RGI60-19.02751	G269573E68866S	19870101	-9999999	-90.4266	-68.8656	19	1	0.011	170	270	-999	0.4	122	106	AQ6C10200013	POLYGON ((-90.42750687199998 -68.8648979559999...
216428	C3S_216429	RGI60-19.02752	G037714E46897S	19660301	-9999999	37.7140	-46.8972	19	4	0.528	970	1170	-999	9.6	35	-9	ZA6C40100001 Ice Plateau	POLYGON ((37.71620283400006 -46.89402806099997...

216429 rows × 18 columns

As can be seen above, the data includes attribute information for each individual glacier (i.e. digitized polygon) in the vector-type dataset. Important for this notebook are the BGNDATE and ENDDATE columns (YYYYMMDD format), which contain information about the time of digitization of the specific glacier. As can already be seen, for some glaciers (part of) these data are missing and filled by -9999999. We will use this information below.

📊 The (area-weighted) mean year of digitization of glacier data#

We can begin to answer the user question by extracting information from the attribute table of the downloaded shapefile. For some glaciers the outline is composed from several scenes over multiple years implying that a begin date (attribute “BGNDATE”) and an end date (attribute “ENDDATE”) is given in the attribute table in YYYYMMDD format (if both are available). For these glaciers, we calculate the mean of the begin and end year:

\( year_t = \textstyle\dfrac{BGNDATE_t + ENDDATE_t}{2}\ \)

For other glaciers, we take the BGNDATE only:

\( year_ t = BGNDATE_t \)

This results in the following plot:

../../_images/0874e54fe998ecc18769a32be2a3b233b93c50765653fe0e343ed581debdb01b.png

Let us check the spatial distribution of the year of digitization for each glacier in the dataset. We therefore produce a world map where every dot represents a glacier, that is colored according to its year of digitization. Glaciers with missing data are not plotted:

../../_images/761ee25bf21c49e13c6b1b4617d4edde48cfa1d689676b4cc146579f869efc33.png

From the above plots, it becomes clear that not all glaciers are digitized in the year 2000 CE. Although the dataset is intended to be a snapshot of the world’s glaciers as they were near the beginning of the 21\(^{st}\) century, the user must keep in mind that the range of digitization dates is substantial. This means that the outlines do not all lie in between the same time frame, which is a main issue of the dataset. The figures above demonstrate that most outlines originate from the period between 2000 and 2010 CE. However, for some areas like New Zealand, the Caucasus or the western United States, the glacier outlines date back to the 1970s or earlier. The dataset is intended to be representative for “around the year 2000 CE”, while in fact the temporal coverage of the outlines from individual glacier ranges from the 1940s into the 2010s CE. The large heterogeneity with respect to the year of digitization of the glacier outlines is a major limitation of the dataset and the user should take note of that before using a specific glacier outline.

We can also quantify this and calculate the average and area-weighted average time of digitization, for the dataset as a whole but also for each of the different 19 RGI regions. For easier comparison, let us check where the different RGI regions are situated:

../../_images/84c12134e790eaabefda0e34bacfe414241ca6b61de9c177a2ecf5e22d38c551.png

Let us now calculate the (area-weighted) mean time of digitization for all those regions:

Arithmetic mean: \( \overline{year}_a = \textstyle\dfrac{1}{{n\in \text{region}}}{\sum\limits_{i=1}^{n\in \text{region}} year_{t,i}} \) with \( n \) the total number of glaciers and \(year_t\) the time of digitization for glacier \(i\) as calculated above.
Area-weighted mean: \( \overline{year}_w = \left(\dfrac{1}{\sum\limits_{i=1}^{n\in \text{region}} A_i}\right) \sum\limits_{i=1}^{n\in \text{region}} \left(A_i \cdot year_{t,i}\right) \) where \( A_i \) is the glacier area for glacier \(i\) [km²].

This results in the following plot:

../../_images/38bbb13a4c1942275cc255bf6b5a78f80cf595a80d2726a34a2f588bcc407ea0.png

We observe that at the global scale, the area-weighted average of the year of digitization is close to 2001 CE, which only slightly differs in time from the nominally stated dataset reference time (i.e. 2000 CE). However, the (area-weighted) average of the time of digitization varies notably across the different RGI regions: for New Zealand (RGI region 18), for example, the area-weighted average is 1978 CE, for Alaska (RGI region 1) it is close to 2009 CE.

🚨 Other potential sources of error and uncertainty#

Apart from a substantial deviation of the derived glacier data to the year 2000 CE, other potential sources of uncertainty include a low resolution and quality of the source data, a poor skill of the digitizer, the presence of shadow or clouds on the satellite images, poor availability of multi-temporal satellite images, the presence of proglacial lakes, and the presence of seasonal snow fields. The largest source of misdelineation of glaciers is, however, the presence of supraglacial debris. For glaciers with extensive debris cover or for rock glaciers, the accuracy can therefore be significantly impacted (Paul et al., 2013). This is related to the fact that the spectral signature of debris is very similar to that of the surrounding moraines and bedrock. In regions with large debris cover (e.g. High Mountain Asia or the Caucasus), the user should be aware that the glacier area can be significantly underestimated due to the presence of debris.

Another issue of the dataset is the unclear interpretation of what a glacier is. As the glacier outlines were created by a globally diverse community, the interpretation of what a glacier or ice cap is (do ice walls need to be included, how do tributaries need to be treated, which type of glaciers should be added, how to divide glacier complexes into separate entities, from which threshold size onwards can we speak of a glacier) deviates among data sources. In this context, debris-covered/rock glaciers (where it is difficult to distinguish debris-covered ice from bedrock) and also the peripheral glaciers of Antarctica and Greenland (where it is difficult to distinguish (outlet) glaciers from the main ice sheet bodies) remain a struggle (Hock et al., 2022). Specific attention of the user is therefore required in these areas. Ice bodies smaller than a regionally variable threshold size (typically in the order of 0.01 km\(^2\)) were furthermore not added (as being the supposed minimum size for ice flow and also to exclude (seasonal) snow fields). This implies that in some regions, glaciers smaller than this threshold were not included in the dataset, which leads to an underestimation of the glacier surface (RGI Consortium, 2017). The total number of glaciers (and the total surface area covered by glaciers) in the dataset is therefore a rather arbitrary quantity. Further investigation also reveals that especially glacier surface extent data in RGI region 14 and 15 (South Asia West and South Asia East) are significantly underestimated in the gridded version of the dataset when compared to the vector counterpart used in this notebook (i.e. up to >50% in RGI region 15 (Li et al., 2021)). Although this is out of scope for this specific notebook, the user should take extra care in these regions.

Quantitative estimates of the errors and the related uncertainty are not given in the data. However, an error model can be found in Pfeffer et al. (2014). Here, it is stated that the relative error of the glacier area is larger for small glaciers than for large glaciers because small glaciers have a larger outline to area ratio. Based on the above, the uncertainty of an individual glacier outline, supposingly being representative for 2000 CE, therefore depends upon both the quality of the source material and the characteristics of the specific glacier.

✅ How well can the data be used to estimate current global glacier ice volumes?#

Going back to our specific use case, it is worth noting that several methods exist in the literature to derive ice thickness and ice volume estimates of glaciers at different spatial scales, of which the most well-known procedures are the volume-area scaling method, numerical modelling, and the estimation of ice thicknesses from surface characteristics and the principle of ice dynamics. These methods, however, strongly rely on the precise determination of the glacier surface area (and hypsometry) to derive total ice volume estimates. Farinotti et al. (2019) used the latter method to derive a consensus estimate of global ice thickness values and the current global glacier ice volume for all glaciers digitized in the RGIv6.0 dataset. They found a total global glacier ice volume of 158.17 ± 41.03 × 10³ km³. The authors stated that the quality of their ice volume estimates highly depends upon the quality of the input data. Other studies using (earlier versions of) the RGI glacier extent data to derive regional or global glacier ice volumes report the same concerns and the same general sources of uncertainty in the input data (e.g. Huss and Farinotti, 2012; Marzeion et al., 2012; Milan et al., 2022).

Although the RGIv6.0 data do not include information on temporal changes of the glacier extent, Zemp et al. (2015), Li et al. (2019) and Zemp et al. (2019) concluded that glacier retreat has been dominant during the last several decades, with the 21\(^{st}\) century marking the historical minimum glacier extent in almost all regions across the globe since at least the 1500s. Results from Li et al. (2019) show that during the period 1980–2015, the rate of global glacier area change was -0.18% per year. Zemp et al. (2019) estimates this number to be -0.34% per year between 1961-2016. From a global record of 471 glacier length change data, Leclerq et al. (2014) furthermore found a median retreat rate over the 1961-2000 period of 7.4 meter per year. These findings indicate rapidly changing glacier conditions (and outlines), with observed glacier shrinkage and retreat in every RGI region during the last several decades (Li et al., 2019). The observed changes of the glacier area from the literature imply a very likely overestimation of the volume estimate for glaciers digitized before 2000 CE (e.g. New Zealand), and an underestimation elsewhere (e.g. Svalbard). The actual retreat and shrinkage (i.e. volume response time) of individual glaciers, however, is dictated by both geometric/topographic factors, such as the slope and the initial glacier size, and on local climatic conditions (Zemp et al., 2015). Eventually, the suitability of RGI outlines to be used for current glacier ice volume estimation therefore depends on the magnitude of the deviation of the time of digitization from the year 2000 CE and the glacier area change, that would have potentially affected the outlines, between 2000 CE and this deviation.

Some studies have also investigated the choice of the glacier inventory on glacier ice volume estimates. For example, Li et al. (2022) found differences in ice volume of 2 to 8% in the Tien Shan when comparing two different glacier inventories, one of them being the RGIv6.0. Van Wyk de Vries et al. (2022) concluded that RGIv6.0 polygons overestimate the spatial extent of the majority of glaciers in the Northern Andes, likely due to rapid recent glacier recession, leading them to correct the outlines for their estimates. Bahr and Radić (2012) furthermore found that the omission of glaciers <0.01 km² (as done in the RGIv6.0) can lead to errors in regional ice volume of up to 10%. At last, Huss and Farinotti (2012) quantified the bulk uncertainties in their ice volume calculation due to imperfections in the (earlier-version) RGI shapes as varying between 1 and 8%, depending on the RGI region. Farinotti et al. (2019) did not perform any sensitivity experiment regarding the glacier outlines.

📈 An attempt to quantify potential glacier volume and surface area over or underestimation due to misdated outlines#

It is difficult to quantify the potential volume and area over or underestimation due to misdated outlines for each glacier in the dataset. However, we can get an idea of the regional glacier area and volume under/overestimation for 2000 CE through a common technique called volume-area scaling for glaciers (e.g. Bahr et al., 1997). The goal is thus to obtain values for the regional glacier volume and glacier surface area change between the year 2000 CE and the area-weighted mean year of digitization of that region \(\overline{year}_w\) (or thus the over or underestimation of the RGI data relative to the year 2000 CE, denoted with the symbols \(dV_{\text{region}}^{2000,\overline{\text{year}}_w}\) and \( dA_{\text{region}}^{2000,\overline{\text{year}}_w}\) respectively). The starting point is the following formula, which scales the glacier surface area \(A_{\text{region}}\) [km\(^2\)] to the total glacier volume \(V_{\text{region}}\) [km\(^3\)] through a power law:

\( V_{\text{region}} \) [km³] \( = cA_{\text{region}}^{\gamma} \)

where \(c\) [km\(^{3-2{\gamma}}\)] and \(\gamma\) [-] are constants (Bahr et al., 1997).

If we know the glacier volume change (or over or underestimation) relative to 2000 CE \(dV_{\text{region}}^{2000,\overline{year}_w}\), we can convert this information into glacier area changes (or over or underestimation) relative to 2000 CE \(dA_{\text{region}}^{2000,\overline{year}_w}\) according to the derivative of the power law:

\( \dfrac{dV_{\text{region}}^{2000,\overline{year}_w}}{dA_{\text{region}}^{2000,\overline{year}_w}} = c{\gamma}A_{\text{region}}^{{\gamma}-1} \)

or when rearranged:

\( dA_{\text{region}}^{2000,\overline{year}_w} \) [km²] \(= \dfrac{dV_{\text{region}}^{2000,\overline{year}_w}}{c{\gamma}A_{\text{region}}^{{\gamma}-1}}\)

In the formula’s above, what we eventually want to know is \( dA_{\text{region}}^{2000,\overline{year}_w}\) (the glacier surface area over or underestimation, or in other words, the glacier area change relative to the year 2000 in a certain RGI region), and what we already know now is:

\(V_{\text{region}}\) [km\(^3\)] around 2000 CE from Farinotti et al. (2019) for a certain RGI region.
\(A_{\text{region}}\) [km\(^2\)] around 2000 CE from the RGI dataset on the CDS for a certain RGI region.

What we now still need is:

Values for \(c\) [km\(^{3-2{\gamma}}\)] and \({\gamma}\) [-].
\(dV_{\text{region}}^{2000,\overline{year}_w}\) [km\(^3\)] or the glacier volume change (over or underestimation) between \(\overline{year}_w\) and the year 2000 CE for a certain RGI region.

To get an idea of these glacier volume changes \(dV_{\text{region}}^{2000,\overline{year}_w}\), we can make use of the ‘Glacier mass change gridded data from 1976 to present derived from the Fluctuations of Glaciers Database’ dataset that is on the CDS. This dataset estimates the gridded yearly mass change of glacier ice \(dM/dt\) in Gt yr\(^{-1}\), which can be converted to glacier volume changes \(dV/dt\) with units of km\(^3\) yr\(^{-1}\) of ice. We therefore calculate the volume change between 2000 CE and the area-weighted average year of digitization for a certain RGI region (\(\overline{year}_w\), as calculated above), and then sum these values to get a total volume change for that specific RGI region \(dV_{\text{region}}^{2000,\overline{year}_w}\). The variable \(dV_{\text{region}}^{2000,\overline{year}_w}\) is thus the total glacier volume change in a certain RGI region between 2000 CE and the area-weighted mean time of glacier digitization \(\overline{year}_w\) for that specific region. If \({{\overline{year}_w}}\) is a decimal (fractional) year, we use linear interpolation to find the interpolated glacier mass and volume change:

\( dV_{\text{region}}^{2000,\overline{year}_w} \text{ [km}^3\text{]} = \begin{cases} \left(\sum\limits_{i=2000}^{\lceil \overline{year}_w \rceil} \sum\limits^{\substack{\text{x,y} \\ \text{{region}}}} \left(\frac{1000}{\rho_i} \frac{dM}{dt}\right)\right) - \left[ \left( \lceil \overline{year}_w \rceil - \overline{year}_w \right) \cdot \left( \sum\limits^{\substack{\text{x,y} \\ \text{region}}} \left(\frac{1000}{\rho_i} \frac{dM}{dt}\right)_{\lceil \overline{year}_w \rceil} \right) \right] & \text{if } \overline{year}_w > 2000 \\ \left(\sum\limits_{i=\lfloor \overline{year}_w \rfloor}^{2000} \sum\limits^{\substack{\text{x,y} \\ \text{region}}} \left(\frac{1000}{\rho_i} \frac{dM}{dt}\right)\right) - \left[ \left( \overline{year}_w - \lfloor \overline{year}_w \rfloor \right) \cdot \left( \sum\limits^{\substack{\text{x,y} \\ \text{region}}} \left(\frac{1000}{\rho_i} \frac{dM}{dt}\right)_{\lfloor \overline{year}_w \rfloor} \right) \right] & \text{if } \overline{year}_w < 2000 \end{cases} \)

Here, we further assume a density \(\rho_i\) of 850 kg m\(^{-3}\) for the mass to volume conservation of glaciers (Huss, 2013). This may look like a complicated formula, but the principle is simple: we sum the mass (and hence volume) changes over all spatial pixels between the year 2000 CE and the area-weighted year of digitization \((\overline{year}_w)\):

If \((\overline{year}_w)\) is an integer, the second part of the formula (after the minus sign) drops out.
If \((\overline{year}_w)\) is not an integer, the formula adjusts for the fractional part of the year:
- If \((\overline{year}_w)\) > 2000, it uses the fraction between \((\overline{year}_w)\) and \((\lceil \overline{year}_w \rceil)\), where \(\lceil \cdot \rceil\) denotes the ceiling function, which rounds a number up to the nearest integer.
- If \((\overline{year}_w)\) < 2000, it uses the fraction between \((\lfloor \overline{year}_w \rfloor)\) and \((\overline{year}_w)\), where \(\lfloor \cdot \rfloor\) denotes the floor function, which rounds a number down to the nearest integer.

This fractional adjustment is calculated by linear interpolation to account for the partial year.

Now we have all information to derive values for \( dV_{\text{region}}^{2000,\overline{year}_w}\). Let us apply this to the 19 different RGI regions in the RGIv6.0 dataset:

The volume estimate in RGI region  1 for the year 2000 CE is underestimated by 599.82 km³.
The volume estimate in RGI region  2 for the year 2000 CE is underestimated by 25.83 km³.
The volume estimate in RGI region  3 for the year 2000 CE is underestimated by 14.31 km³.
The volume estimate in RGI region  4 for the year 2000 CE is underestimated by 32.56 km³.
The volume estimate in RGI region  5 for the year 2000 CE is underestimated by 50.51 km³.
The volume estimate in RGI region  6 for the year 2000 CE is underestimated by 0.09 km³.
The volume estimate in RGI region  7 for the year 2000 CE is underestimated by 100.18 km³.
The volume estimate in RGI region  8 for the year 2000 CE is underestimated by 12.76 km³.
The volume estimate in RGI region  9 for the year 2000 CE is underestimated by 78.40 km³.
The volume estimate in RGI region 10 for the year 2000 CE is underestimated by 11.96 km³.
The volume estimate in RGI region 11 for the year 2000 CE is underestimated by 5.30 km³.
The volume estimate in RGI region 12 for the year 2000 CE is  overestimated by 2.12 km³.
The volume estimate in RGI region 13 for the year 2000 CE is underestimated by 63.60 km³.
The volume estimate in RGI region 14 for the year 2000 CE is underestimated by 3.90 km³.
The volume estimate in RGI region 15 for the year 2000 CE is underestimated by 27.86 km³.
The volume estimate in RGI region 16 for the year 2000 CE is underestimated by 4.33 km³.
The volume estimate in RGI region 17 for the year 2000 CE is underestimated by 15.78 km³.
The volume estimate in RGI region 18 for the year 2000 CE is  overestimated by 3.75 km³.
The volume estimate in RGI region 19 for the year 2000 CE is underestimated by 248.11 km³.

The volume estimate at the global scale for the year 2000 CE is underestimated by 1289.42 km³.

Or when plotted:

../../_images/5b3afddf387e9aec6337a2876b0c24efae2a613774837539c0645d694e8569e7.png

In the plot above, negative values indicate an underestimation. It is clear that especially RGI regions 1 (Alaska) and 19 (Antarctic and Subantarctic) exhibit a relatively large underestimation of the total glacier ice volume, mainly due to their large deviation of the area-weighted average year of digitization to the year 2000 CE, as well as the large glacier ice volume changes during this corresponding period. At the global scale, glacier ice volume estimates would exhibit an underestimation of ca. 1300 km³ due to misdated glacier outlines, which is only ca. 0.8% of the total glacier ice volume estimate by Farinotti et al. (2019). Glacier ice volumes are only overestimated in RGI regions 12 (Caucasus and Middle East) and 18 (New Zealand). Percentage-wise, the highest deviation relative to the total glacier ice volume in a certain region occurs in RGI region 10 (North Asia ~ in the order of 8%). At the global scale, the above obtained deviation related to the misdated glacier outlines from RGIv6.0 is thus relatively low when compared to the total global glacier ice volume estimate from Farinotti et al. (2019), implying that the dataset is well-suited to estimate current global glacier ice volumes for the year 2000 CE. The corresponding systematic error is, however, glacier- and region-dependent.

In a next step, we can estimate the potential over or underestimation of the glacier area relative to the year 2000 CE (\( dA_{\text{region}}^{2000,\overline{year}_w}\)). With the volume-area scaling formula, we therefore need to find the best-fit values for \(c\) and \({\gamma}\) using glacier surface areas for each region from RGIv6.0 data (\(A_{\text{region}}\)) and the corresponding regional total volume estimates (\(V_{\text{region}}\)) from Farinotti et al. (2019):

../../_images/9b7a35a34da52994dfd7363fe55534447a44a59e303eec9d6c56a69a6268611f.png

With now also the values for \(c\) and \(\gamma\) known, we can calculate the expected glacier area under/overestimation in each RGI region from reverting the volume changes using the area-volume scaling method. Hence, based on these observed volume changes (\( dV_{\text{region}}^{2000,\overline{year}_w}\)) between 2000 CE and the area-weighted mean year of digitization (\(\overline{year}_w\)), as just derived above, we project glacier areas back to the year 2000 CE using the area-volume scaling formula discussed above:

\( dA_{\text{region}}^{2000,\overline{year}_w} \) [km²] \(= \dfrac{dV_{\text{region}}^{2000,\overline{year}_w}}{c{\gamma}A_{\text{region}}^{{\gamma}-1}}\)

or expressed as a relative area change (compared to the total glacier area in the RGI region):

\( dA_{\text{region}}^{2000,\overline{year}_w} \) [%] = \(100*\dfrac{dA_{{\text{region}}}^{2000,\overline{year}_w}}{A_{\text{region}}}\)

This results in:

The glacier surface area in RGI region  1 for the year 2000 CE is underestimated by 2134.21 km² or 2.46% of its total glacier area.
The glacier surface area in RGI region  2 for the year 2000 CE is underestimated by 148.87 km² or 1.02% of its total glacier area.
The glacier surface area in RGI region  3 for the year 2000 CE is underestimated by 48.33 km² or 0.05% of its total glacier area.
The glacier surface area in RGI region  4 for the year 2000 CE is underestimated by 141.92 km² or 0.35% of its total glacier area.
The glacier surface area in RGI region  5 for the year 2000 CE is underestimated by 161.09 km² or 0.12% of its total glacier area.
The glacier surface area in RGI region  6 for the year 2000 CE is underestimated by 0.57 km² or 0.01% of its total glacier area.
The glacier surface area in RGI region  7 for the year 2000 CE is underestimated by 459.11 km² or 1.35% of its total glacier area.
The glacier surface area in RGI region  8 for the year 2000 CE is underestimated by 113.06 km² or 3.83% of its total glacier area.
The glacier surface area in RGI region  9 for the year 2000 CE is underestimated by 320.93 km² or 0.62% of its total glacier area.
The glacier surface area in RGI region 10 for the year 2000 CE is underestimated by 111.95 km² or 4.65% of its total glacier area.
The glacier surface area in RGI region 11 for the year 2000 CE is underestimated by 51.54 km² or 2.46% of its total glacier area.
The glacier surface area in RGI region 12 for the year 2000 CE is  overestimated by 23.44 km² or 1.79% of its total glacier area.
The glacier surface area in RGI region 13 for the year 2000 CE is underestimated by 263.56 km² or 0.53% of its total glacier area.
The glacier surface area in RGI region 14 for the year 2000 CE is underestimated by 17.94 km² or 0.05% of its total glacier area.
The glacier surface area in RGI region 15 for the year 2000 CE is underestimated by 159.97 km² or 1.09% of its total glacier area.
The glacier surface area in RGI region 16 for the year 2000 CE is underestimated by 40.84 km² or 1.74% of its total glacier area.
The glacier surface area in RGI region 17 for the year 2000 CE is underestimated by 75.17 km² or 0.26% of its total glacier area.
The glacier surface area in RGI region 18 for the year 2000 CE is  overestimated by 42.74 km² or 3.68% of its total glacier area.
The glacier surface area in RGI region 19 for the year 2000 CE is underestimated by 786.77 km² or 0.59% of its total glacier area.

The total glacier surface area in RGIv6.0 for the year 2000 CE is underestimated by 4969.64 km² or 0.67% of the total glacier area in the dataset.

Or when plotted:

../../_images/8a3187574791693e023ba9850fbf802aa5854117fef35d25a71e1f2c467ab61e.png

In the plots above, negative values indicate an underestimation. We conclude that, seen globally, there is an underestimation of the glacier surface area for the year 2000 CE in the RGIv6.0 dataset, which is to be expected by the values of the area-weighted average year of digitization that are generally more recent than the year 2000 CE. The global-scale underestimation is, however, relatively small compared to the total global surface area estimate (< 1%) and can hence be assumed to be of minor influence for applications at the global scale. Again, it is worth noting that the systematic error is, however, glacier- and region-dependent.

ℹ️ If you want to know more#

Key resources#

“Glaciers distribution data from the Randolph Glacier Inventory for year 2000” on the CDS
“Glacier mass change gridded data from 1976 to present derived from the Fluctuations of Glaciers Database” on the CDS
C3S EQC custom functions, c3s_eqc_automatic_quality_control prepared by BOpen.

References#

Bahr, D. B., Meier, M. F., and Peckham, S. (1997). The physical basis of glacier volume-area scaling. Journal of Geophysical Research, 102(B9), 20355–20362, doi: 10.1029/97JB01696.
Farinotti, D., Huss, M., Fürst, J. J., Landmann, J., Machguth, H., Maussion, F., and Pandit, A. (2019). A consensus estimate for the ice thickness distribution of all glaciers on Earth. Nature Geoscience, 12(3), 168-173. doi: 10.1038/s41561-019-0300-3.
Hock, R., Maussion, F., Marzeion, B., and Nowicki, S. (2022). What is the global glacier ice volume outside the ice sheets? Journal of Glaciology, 69(273), 204–10. doi: 10.1017/jog.2023.1.
Huss, M., and Farinotti, D. (2012). Distributed ice thickness and volume of all glaciers around the globe, Journal of Geophysical Research, 117, F04010. doi: 10.1029/2012JF002523.
Huss, M. (2013). Density assumptions for converting geodetic glacier volume change to mass change, The Cryosphere, 7, 877–887, doi: 10.5194/tc-7-877-2013.
Leclercq, P. W., Oerlemans, J., Basagic, H. J., Bushueva, I., Cook, A. J., and Le Bris, R. (2014). A data set of worldwide glacier length fluctuations, Cryosphere, 2014(8), 659–672. doi: 10.5194/tc-8-659-2014.
Li, Y. J., Ding, Y. J., Shangguan, D. H., and Wang, R. J. (2019). Regional differences in global glacier retreat from 1980 to 2015, Advances in Climate Change Research, 10(4), 203–213. doi: 10.1016/j.accre.2020.03.003.
Li, Y. J., Li, F., Shangguan, D. H., Ding, Y. J. (2021). A New Global Gridded Glacier Dataset Based on the Randolph Glacier Inventory Version 6.0. Journal of Glaciology, 67 (2021), 773–76. doi: 10.1017/jog.2021.28.
Li, F., Maussion F., Wu, G., Chen, W., Yu, Z., Li, Y., and Liu, G. (2022). Influence of glacier inventories on ice thickness estimates and future glacier change projections in the Tian Shan range, Central Asia. Journal of Glaciology 1–15. doi: 10.1017/jog.2022.60.
Marzeion, B., Jarosch, A. H., and Hofer, M. (2012). Past and future sea-level change from the surface mass balance of glaciers, The Cryosphere, 6, 1295–1322. doi: 10.5194/tc-6-1295-2012.
Millan, R., Mouginot, J., Rabatel, A., and Morlighem, M. (2022). Ice velocity and thickness of the world’s glaciers, Nature Geoscience, 15, 124–129. doi: 10.1038/s41561-021-00885-z.
Paul, F., Barrand, N. E., Baumann, S., Berthier, E., Bolch, T., Casey, K., Frey, H., Joshi, S. P., Konovalov, V., Le Bris, R., Mölg, N., Nosenko, G., Nuth, C., Pope, A., Racoviteanu, A., Rastner, P., Raup, B., Scharrer, K., Steffen, S., and Winsvold, S. (2013). On the Accuracy of Glacier Outlines Derived from Remote-Sensing Data. Annals of Glaciology, 54(63), 171–82. doi: 10.3189/2013AoG63A296.
Pfeffer, W. T., Arendt, A. A., Bliss, A., Bolch, T., Cogley, J. G., Gardner, A. S., Hagen, J. O., Hock, R., Kaser, G., Kienholz, C., Miles, E. S., Moholdt, G., Mölg, N., Paul, F., Radić, V., Rastner, P., Raup, B. H., Rich, J., Sharp, M. J., and Glasser, N. (2014). The Randolph Glacier Inventory: A globally complete inventory of glaciers, Journal of Glaciology, 60(221), 537-552. doi: 10.3189/2014JoG13J176.
RGI Consortium (2017). Randolph Glacier Inventory – A Dataset of Global Glacier Outlines: Version 6.0: Technical Report, Global Land Ice Measurements from Space, Colorado, USA. Digital Media. doi: 10.7265/N5-RGI-60.
Zemp, M., Frey, H., Gärtnew-Roer, I., Nussbaumer, S. U., Helzle, M., Paul, F., Haeberli, W., Denzinger, F., Ahlstrøm, A. P., Anderson, B., Bajracharya, S., Baroni, C., Braun, L. N., Cáceres, B. E., Casassa, G., Cobos, G., Dávila, L. R., Delgado Granados, H., Demuth, M. N., Espizua, L., Fischer, A., Fujita, K., Gadek, B., Ghazanfar, A., Hagen, J. O., Holmlund, P., Karimi, N., Li, Z., Pelto, M., Pitte, P., Popovnin, V. V., Portocarrero, C. A., Prinz, R., Sangewar, C. V., Severskiy, I., Sigurdsson, O., Soruco, A., Usubaliev, R., and Vincent, C. (2015). Historically unprecedented global glacier decline in the early 21st century, Journal of Glaciology, 61, 745-762. doi: 10.3189/2015JoG15J017.
Zemp, M., Huss, M., Thibert, E., Eckert, N., McNabb, R., Huber, J., Barandun, M., Machguth, H., Nussbaumer, S. U., Gärtner-Roer, I., Thomson, L., Paul, F., Maussion, F., Kutuzov, S., and Cogley, J. G. (2019). Global glacier mass changes and their contributions to sea-level rise from 1961 to 2016. Nature, 568, 382–386. doi: 10.1038/s41586-019-1071-0.

Quality assessment of glacier extent data from satellite observations: bias, reliability, and accuracy for glaciological and hydrological applications

Contents

1.3.2. Quality assessment of glacier extent data from satellite observations: bias, reliability, and accuracy for glaciological and hydrological applications#

🌍 Use case: Using digitized glacier outlines and glacier extent data around the year 2000 CE for the estimation of the corresponding global glacier ice volumes and their sea level equivalent#

❓ Quality assessment question#

📢 Quality assessment statement#

📋 Methodology#

Short description#

Structure and (sub)sections#

📈 Analysis and results#

⏬ Data preparation and processing#

📊 The (area-weighted) mean year of digitization of glacier data#

🚨 Other potential sources of error and uncertainty#

✅ How well can the data be used to estimate current global glacier ice volumes?#

📈 An attempt to quantify potential glacier volume and surface area over or underestimation due to misdated outlines#

ℹ️ If you want to know more#

Key resources#

References#