Greenland-wide inventory of ice marginal lakes published in Scientific Reports

This month we published findings from our work on Greenland’s ice marginal lakes. This work is part of the ESA Glaciers CCI (Climate Change Initiative), focused on producing a Greenland-wide inventory of ice marginal lakes as a benchmark dataset, also referred to as an Essential Climate Variable. The publication is the culmination of a lot of dedicated work from individuals in the project, pooling together specialist knowledge on classifying ice marginal lakes through various remote sensing approaches.

Ice marginal lakes form at the terrestrial margins of the Greenland Ice Sheet, ice caps, and mountain glaciers, where the outflow is dammed or restricted. Ice marginal lakes can buffer the contribution of glacial melt to the sea level budget, forming a significant store of meltwater. These lakes are also understood to burst due to dam failure and breaching, causing catastrophic releases of water to the ocean known as GLOFs (Glacial Lake Outburst Floods). Ice marginal lake dynamics are generally ignored in current predictions of future sea level rise, where glacial runoff is assumed to contribute directly to sea level once it leaves the glacial system.

A false colour satellite image of Kangaarssuup Tasersua in South-West Greenland. Kangaarssuup Tasersua is one of the largest ice-marginal lakes of the south-west region of the ice sheet margin, as mapped subsequently (‘KT’ in Figure 2 from How et al., 2021). Melt runoff from the outlet glacier to the east drains into the basin, but outflow is dammed by the glacier in the west, forming the ice marginal lake. Source: SnapPlanet

We assessed the significance of this terrestrial store of meltwater in Greenland, using three independent and established remote sensing techniques to classify ice marginal lakes over the entirety of the ice margin (one of which is sketched below the list):

  1. Backscatter classification from Sentinel-1 SAR (Synthetic Aperture Radar) satellite imagery
  2. Multi-spectral classification from Sentinel-2 optical satellite imagery
  3. Hydrological sink detection from the ArcticDEM (Digital Elevation Model)
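
To give a flavour of the second approach, here is a minimal sketch of multi-spectral water classification using the Normalised Difference Water Index (NDWI) with the rasterio package. This illustrates the general technique rather than the paper’s exact classifier – the band file paths and the 0.3 threshold are hypothetical placeholders:

import rasterio

# hypothetical paths to Sentinel-2 green (band 3) and NIR (band 8) rasters
with rasterio.open('S2_B03.tif') as green_src, rasterio.open('S2_B08.tif') as nir_src:
    green = green_src.read(1).astype('float32')
    nir = nir_src.read(1).astype('float32')

# NDWI: water reflects strongly in green and absorbs in near-infrared,
# so water pixels take high positive values
ndwi = (green - nir) / (green + nir + 1e-9)

# threshold to a binary water mask (0.3 is illustrative; workable thresholds
# vary with latitude, illumination and lake turbidity)
water_mask = ndwi > 0.3
print(f'Water pixels classified: {int(water_mask.sum())}')
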
Figure 1 from How et al. (2021) showing the overview of the 2017 ice marginal lake inventory of Greenland. Each point represents a lake – blue points indicate a lake adjacent to the ice sheet margin, and orange points are lakes that share a margin with either an ice cap or mountain glacier.

The inventory revealed 3347 ice marginal lakes (above 0.05 km2) across Greenland in 2017, with the highest number of lakes adjacent to the southwest ice sheet margin and Greenland’s ice caps and mountain glaciers.

Comparison to previous work by Carrivick and Quincey (2014) suggests a 75% increase in the number of lakes along the west margin of the ice sheet over the past three decades. This trend is largely related to marked variations in the presence of smaller lakes (i.e. 0.05-0.15 km2), with future lake formation likely to be concentrated in regions where the terrestrial ice margin length will increase (e.g. where marine-terminating outlets retreat onto land).

Figure 2 from How et al. (2021) showing ice marginal lakes over a selected region of the southwestern sector of the ice sheet margin. Panel A shows lake area, B shows lake shape, and C shows the detection method. The largest lake of this region is Kangaarssuup Tasersua, which is labelled ‘KT’ in Panel B.

By evaluating the performance of the three methodologies used to derive the inventory, we found that detecting ice marginal lakes is especially challenging in Greenland given its large latitudinal range. Water classification studies relying on a single detection method over large regions are therefore at risk of under-representing lake coverage.

A false colour satellite image of Inderhytten lake in North-East Greenland. Inderhytten is the second largest lake in our ice marginal lake inventory, at 112 km2. This was a particularly interesting lake as its classification from satellite imagery is somewhat tricky – it is ice-covered for the majority of the year and its near-fjord margin lies at a low elevation. Source: SnapPlanet

As we were writing the paper, a really nice complementary study by Shugar et al. (2020) came out, presenting a new global glacial lake inventory that included Greenland. This global inventory was derived solely from optical satellite imagery, which we found captured only 44% of the lakes present in our Greenland inventory. This reflects how multi-sensor and multi-method classification is time-consuming and requires powerful processing capabilities that are not feasible in global studies at present, highlighting the need to continue this research and incorporate new and innovative approaches to classifying ice marginal lakes.


Our publication is available to read (with open access) through Scientific Reports:

How, P., Messerli, A., Mätzler, E., Santoro, M., Wiesmann, A., Caduff, R., Langley, K., Bojesen, M.H., Paul, F., Kääb, A. and Carrivick, J.L. (2021) Greenland-wide inventory of ice marginal lakes using a multi-method approach. Scientific Reports. doi:10.1038/s41598-021-83509-1

And the inventory can be freely accessed and downloaded through the CEDA archive.

Six months working in Greenland

Since starting work in Greenland, we have had a handful of interesting projects, which started before I arrived. Thankfully these projects are related to the cryosphere, so I have been able to throw myself into them. As I was hired primarily for my knowledge of programming, my main tasks in these projects thus far have been automating and running batch processing of large satellite datasets.

Firstly, we have been finalising an ESA project looking at ice-marginal lakes in Greenland, creating an inventory for the whole of Greenland as well as generating time-series datasets for a select few ice-marginal lakes in SW Greenland. A remote sensing approach is used to identify these lakes, with three different methods for cross-validation: Sentinel-1 (radar) imagery, Sentinel-2 (optical) imagery, and the ArcticDEM (digital elevation model).

Unnamed IML in Greenland

The second largest ice-marginal lake in Greenland (unnamed), which lies on the SW margin of the ice sheet. The lake was 88.5 sq km in 2017, fed by various glacial outlets. The largest ice-marginal lake in Greenland, Romer Sø, is located on the NE margin and is the lake into which the well-known piedmont glacier, Elephant Foot Glacier, flows.

I came into the ice-marginal lakes project just as the pan-Greenland dataset was being refined, so could assist with the data quality check and metadata generation. I also took over the optical image processing scripts, modifying them to generate the time-series dataset. Re-engineering scripts so that big data processing can be run at the press of a button has been a highlight for me – it’s an achievement when you know you have processes that are slick, efficient and require little effort to produce valuable datasets.

The second project I have been working on is another ESA project, looking at detecting supraglacial lakes from optical satellite imagery. Unlike the former project where we were delivering a dataset, the supraglacial lake component was to serve as a research/exploration part of the project – we were to explore and analyse the dataset as well as produce it.

Supraglacial lakes in the Sermeq Kujalleq catchment

Supraglacial lakes in the Sermeq Kujalleq catchment (also known by its Danish name Jakobshavn Isbræ), near Ilulissat, West Greenland. This particular scene was captured in June 2019, close to the beginning of the melt season when supraglacial lakes in the lower catchment fill and drain. Lakes in the upper catchment tend to fill and drain later in the melt season, a pattern commonly known as an upglacier-propagating drainage regime.

I came on to this project just as the kick-off meeting happened, so was thrown straight into writing the processing scripts from scratch. Whilst I wouldn’t say I had a strong background in remote sensing before coming to Asiaq, I have found that most of the scripting felt similar to the scripts I made during my PhD for processing terrestrial time-lapse imagery (a Python toolset called PyTrx). They are essentially rooted in the same theory and therefore, although the task felt daunting at first, it was a good feeling to see something familiar.

On top of all this, I have been trying to streamline our data management. This is part of a longer-term strategy to optimise common routines and save time in projects. I have been working on batch download scripts for some of the main satellite platforms – namely Sentinel-1, Sentinel-2, Landsat, MODIS and ArcticDEM. Whilst these tasks might seem straightforward, a lot of time has gone into data file structures and Asiaq-wide storage solutions. Equally, limitations with particular databases have meant finding alternative sources to download from – a key example is the Long Term Archive (LTA) for Sentinel-2 imagery, which meant finding alternatives to Copernicus SciHub for older Sentinel-2 scenes.
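
As a flavour of what these batch download scripts involve, here is a minimal sketch of querying and bulk-downloading Sentinel-2 scenes using the sentinelsat package – an illustration rather than our actual scripts, with hypothetical credentials, area-of-interest file and download directory:

from sentinelsat import SentinelAPI, read_geojson, geojson_to_wkt

# hypothetical credentials and area-of-interest polygon
api = SentinelAPI('user', 'password', 'https://scihub.copernicus.eu/dhus')
footprint = geojson_to_wkt(read_geojson('nuuk_aoi.geojson'))

# query one melt season of Sentinel-2 Level-2A scenes, filtering out
# heavily clouded acquisitions
products = api.query(footprint,
                     date=('20190601', '20190930'),
                     platformname='Sentinel-2',
                     processinglevel='Level-2A',
                     cloudcoverpercentage=(0, 30))

# batch download everything that matched
api.download_all(products, directory_path='/data/sentinel2')
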

So, all in all, it has been quite a programming-heavy time since I started work in Greenland. I still feel somewhat lost and overwhelmed by the move, but we have some really interesting projects that gel well with the skills I have. We have submitted a handful of proposals and hope to hear some good news from them soon. I’m also looking at fleshing out some project ideas of my own, which will likely go into future proposals. It’s been a very intense time, but the opportunity to live and work in Greenland far outweighs it all.

PyTrx published in Frontiers in Earth Science

Frontiers in Earth Science recently published our work on PyTrx, the Python toolset developed during my PhD for processing oblique time-lapse imagery of glacial environments. The toolset is freely available via pip install and GitHub, and this paper serves as its companion piece to inform users of its capabilities and applications.

Ice velocities from Kronebreen

Figure 5 from our Frontiers publication, demonstrating PyTrx’s dense feature-tracking and georectification capabilities. Velocities were determined from oblique time-lapse image pairs between 14 June and 7 July 2014 at Kronebreen, a tidewater glacier in Svalbard. Templates (represented here as points) were defined over a 50×50 m grid, matched between image pairs using normalised cross-correlation, and filtered by correlation (i.e. templates were retained where the correlation of the match was above 0.8). The sequence shows an early season speed-up at the terminus of the glacier, where velocities increase from an average of 2.5 m/day (14-16 June, first panel) to 4.7 m/day (5-7 July, last panel).

PyTrx came about because I wanted to derive measurements from time-lapse imagery I had collected from Kronebreen and Tunabreen, two tidewater glaciers in Svalbard, and couldn’t find an openly available toolset that met my needs. There are a handful of toolsets for processing ice velocities from time-lapse imagery (see ImGRAFT, Pointcatcher and EMT – three great examples), but I wanted to also process other types of measurements, such as meltwater plume footprint extents, supraglacial lake areas, and changes in terminus position. Additionally, most toolsets that I came across were programmed in a limited range of programming languages, mainly Matlab, and I felt there was a need for a toolset in an open-source programming language for those who wanted an alternative that did not rely upon licensed software.

We set about making PyTrx just for our own processing needs at first, programmed in Python and largely utilising OpenCV, a Python package that handles complex computer vision operations on optical imagery. Before long, we realised there was a need for this toolset in the public domain, with growing interest from others, and so we began focusing on finalising PyTrx as an operational package that anyone could use.

Delineating meltwater plume footprints using PyTrx

Figure 8 from our Frontiers publication, showing changes in meltwater plume extent distinguished from time-lapse imagery of Kronebreen using PyTrx. The surface expression of the meltwater plume has been tracked through images captured on 5 July 2014 at 18:00 (A), 20:00 (B), and 22:00 (C) to demonstrate part of its diurnal recession. Each plot shows the plume definition in the image plane (top) and its translation to real-world coordinates, plotted onto a coinciding Landsat 8 satellite image (bottom).

PyTrx has been developed with an object-oriented design, meaning that the core functions are wrapped in callable objects, making it accessible to beginners in programming whilst also serving the more experienced. The following main functions can be performed with the toolset: dense template matching and sparse optical flow methods for deriving velocities, automated detection of area features (such as supraglacial lakes), manual delineation of area and line features (e.g. meltwater plume footprints, terminus position), and camera calibration and optimisation for refining the georectification of measurements from the images.
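
To give a flavour of the template matching that underpins the velocity functions, here is a minimal OpenCV sketch of normalised cross-correlation between an image pair. Note this shows the general technique rather than PyTrx’s own API, and the filenames, template position and sizes are hypothetical:

import cv2

# load a pair of sequential grayscale time-lapse frames (hypothetical filenames)
im0 = cv2.imread('frame_0.jpg', cv2.IMREAD_GRAYSCALE)
im1 = cv2.imread('frame_1.jpg', cv2.IMREAD_GRAYSCALE)

# define a square template around a point of interest in the first image
y, x, half = 400, 600, 25
template = im0[y-half:y+half, x-half:x+half]

# search the second image for the template using normalised cross-correlation
result = cv2.matchTemplate(im1, template, cv2.TM_CCOEFF_NORMED)
_, max_corr, _, max_loc = cv2.minMaxLoc(result)

# keep the match only if its correlation exceeds a threshold (0.8, as in Figure 5)
if max_corr > 0.8:
    dx = (max_loc[0] + half) - x
    dy = (max_loc[1] + half) - y
    print(f'Pixel displacement: ({dx}, {dy}), correlation {max_corr:.2f}')
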

There were many stumbling blocks when it came to publishing PyTrx. I struggled because the feedback, although positive, demanded a large undertaking that made me question PyTrx’s worth, and I doubted my own capabilities in delivering a sound toolset to the glaciology community. Overall though, the review process brought about big changes to PyTrx, which were absolutely essential to improving and finalising the toolset. I owe a lot to the review process, without which the toolset would not have reached its full potential and be what it is today.

Terminus lines derived with PyTrx

Figure 9 from our Frontiers publication, demonstrating PyTrx’s ability to extract terminus profiles from Tunabreen, a tidewater glacier in Svalbard. Terminus lines were manually traced from sequential time-lapse images, and subsequently georectified to provide a record of terminus retreat.


Links

PyTrx publication in Frontiers – our paper, describing the toolset and its applications using time-lapse imagery of tidewater glaciers in Svalbard

PyTrx GitHub repository – where PyTrx can be downloaded from (the master branch holds the raw scripts, whilst the distribution branch holds the package files and readthedocs materials)

PyTrx readthedocs – PyTrx guide and code documentation

PyTrx on PyPI – PyTrx’s package distribution via pip

Making a PyPI package

Recently, I have had a paper accepted which presents PyTrx, a new Python toolset for use in glacial photogrammetry. Over the course of getting this published, it was suggested by co-authors and reviewers alike to use a package manager for easy download and implementation of PyTrx. I therefore wanted to package the toolset up for distribution via PyPI (‘pip’), thus making it easily accessible to other Python users with the simple command pip install pytrx. Whilst I found the tutorials online informative, there were some pitfalls which I found hard to solve with the given information. So here is an account of how I got my package on PyPI. The associated files for the PyTrx package are available on a branch of PyTrx’s GitHub repository, if you want to see this walkthrough in action.

Defining the package files

First and foremost, the file structure of the toolset is crucial to it being packaged up correctly. The top directory should contain a folder holding your package, plus several other files with the necessary setup information:

 master_folder
   - PyTrx
   - LICENSE.txt
   - README.md
   - setup.py 

This was one of the first slip-ups I made, putting all my toolset scripts in the top directory rather than in a folder of their own. If the Python scripts that make up your package are not placed in their own folder, they will not be found when it comes to compiling the package.

So let’s go through each of these elements, beginning with the folder that contains the Python scripts we wish to turn into a PyPI package. An initialisation file needs to be created in this folder in order to import the directory as a package. This is simply an empty Python script called __init__.py, so our folder structure will look a bit like this now:

 master_folder
   - PyTrx
       - __init__.py
   - LICENSE.txt
   - README.md
   - setup.py 

Moving on to the LICENSE.txt file: it is important to define a license with any Python package that is publicly distributed, in order to inform the user how your package can be used. This can simply be a text file containing a copied license. A straightforward and popular license for distributing code is the MIT license, which allows code to be used and adapted with appropriate credit, but there are great guides online for choosing a license appropriate for you (e.g. choosealicense.com). This file has to be called ‘license’ or ‘licence’ (uppercase or lowercase) so that it is recognised when it comes to compiling the package.

Similarly, the README.md file has to be called ‘readme’ specifically so that it is recognised when it comes to compiling the package. This file contains a long description of the Python package. It might be the case that you already have a README.md file if you have hosted your scripts on GitHub, in which case you can simply adopt this as your readme. Just remember that it should be written in Markdown, and that the readme file will form the main description of your package that people read when they navigate to the package’s PyPI webpage.

And finally, the setup.py file. The setup file is probably the trickiest to define, but the most important, as here we outline all of the metadata associated with our Python package, including the package’s recognised pip name (i.e. the one used in the command pip install NAME), its version, author and contact details, keywords, short package description, and dependencies. Here is PyTrx’s setup.py file to serve as an example:

import setuptools

with open("README.md", "r") as fh:
    long_description = fh.read()

setuptools.setup(
    name="pytrx", 
    version="1.1.0",
    author="Penelope How",
    author_email="pennyruthhow@gmail.com",
    description="An object-oriented toolset for calculating velocities, surface areas and distances from oblique imagery of glacial environments",
    long_description=long_description,
    long_description_content_type="text/markdown",
    url="https://github.com/PennyHow/PyTrx",
    keywords="glaciology photogrammetry time-lapse",
    packages=setuptools.find_packages(),
    classifiers=[
        "Programming Language :: Python :: 3",
        "License :: OSI Approved :: MIT License",
        "Development Status :: 5 - Production/Stable",
        "Intended Audience :: Science/Research",
        "Natural Language :: English",
        "Operating System :: OS Independent",
    ],
    install_requires=['glob2', 'matplotlib', 'numpy', 'opencv-python>=3', 'pillow', 'scipy'],
    python_requires='>=3',
)

Most of the variables are straightforward to adapt for your own package’s setup.py file. The ones to watch out for are the classifiers variable, where metadata flags are defined, and the install_requires variable, where the package’s dependencies are declared. PyPI offers a good resource that lists all of the possible classifiers you can add to the classifiers variable.

Finding out how to define dependencies was a little trickier though, as the main PyPI tutorial does not address this. This page gave a brief outline of how to define them with the install_requires variable, but I found that I still had problems in the subsequent steps with package incompatibilities. My main problem was that I had largely worked with conda rather than pip for managing my Python packages, so there were a number of discrepancies between the two in configuring dependencies with PyTrx. My main challenge was finding a balance with OpenCV and GDAL, two notoriously difficult packages to find compatible versions for – I had managed this with conda, finding two specific versions of these packages to configure a working environment. In pip, this proved much harder. The package versions used in conda were not the same for pip, and there wasn’t an official repository for OpenCV, only an unofficial one called opencv-python. We’ll learn more about testing dependency set-ups a bit later on, but for now, just check that each PyPI package dependency is available, and use >= or <= to define whether the package needs to be above or below a certain version. It is generally advised not to pin a dependency to a specific version (i.e. ==), I guess because it reduces the flexibility of the package installation for users.

Generating the distribution files

Once we have all of our files, we can compile our package and generate the distribution files that will eventually be uploaded to TestPyPI and PyPI. It is advised to use TestPyPI to test your package distribution before doing the real deal on PyPI, and I found it incredibly useful as an apprehensive first-time uploader.

If you do decide to test your package on TestPyPI, it is good etiquette to change the name of your package (defined in setup.py) to something very unique – there are many test packages on TestPyPI, and although they are deleted on a regular basis, there are plenty of package names that yours could clash with. In the case of PyTrx, I defined the package name as pytrxhow (the package name with my surname), so there was no chance of using a name that had already been taken. Additionally, you should take your dependencies out of the setup.py file, as often the same packages do not exist on TestPyPI and would therefore not give an accurate reflection of how your package dependencies will look on PyPI.

To generate the distribution files, two packages need to be installed into your Python environment: setuptools and wheel. I already had versions of these packages in my conda environment, but I updated them using the same command (in Anaconda Prompt) as if I wanted to install them:

conda install setuptools wheel

After these are installed, navigate to the directory where all of your files are (i.e. in master_folder) using the cd command, and run the following command to build your distribution files for TestPyPI:

python3 setup.py sdist bdist_wheel

This should generate two folders containing files that look something like this:

master_folder
   - PyTrx
       - __init__.py
   - LICENSE.txt
   - README.md
   - dist
       - pytrx-1.1.0-py3-none-any.whl
       - pytrx-1.1.0.tar.gz
   - pytrx.egg-info
       - PKG-INFO
       - SOURCES.txt	
       - dependency_links.txt	
       - requires.txt	
       - top_level.txt
   - setup.py 

The dist and egg-info folders should contain all of the information inputted into the setup.py file, so it’s a good idea to check through these to see that the files are populated correctly. The SOURCES.txt file should contain a list of the paths to all of the relevant files for making your package. If you have taken out your dependencies, then the requires.txt file should be empty.

Testing the distribution

There are two ways to test that the distribution files work: 1. using TestPyPI to trial the distribution and the ‘look’ of the PyPI entry, and 2. using the setup.py file to test the package installation in your local environment (including dependency solving). Beginning with the test on TestPyPI, start by creating an account on TestPyPI and generating an API token so you can securely upload the package (there is a great set of instructions for doing this here). Make sure to write down all of the information associated with the token, as you will not be able to see it again.

Next, make sure that you have an up-to-date version of the twine package in your environment. Twine is a Python package primarily for uploading packages, which can easily be installed/upgraded in a conda environment with the following command:

conda install twine

Now, Twine can be used to facilitate the upload of your package to TestPyPI with this command (making sure that you are still in your master_folder directory):

python3 -m twine upload --repository-url https://test.pypi.org/legacy/ dist/*

Once the command has run, there will be a link to your TestPyPI repository at the bottom which you can click on to take you to it. You can use this to test install your package with no dependencies. In the case of PyTrx (pytrxhow, my test version), this could be done with the following command (just change ‘pytrxhow’ to specify a different package):

pip install -i https://test.pypi.org/simple/ pytrxhow 
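
One option I did not use here, but which pip supports, is pointing pip at the real PyPI as a fallback index so that dependencies can still be resolved during a TestPyPI install:

pip install --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple/ pytrxhow
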

This is all well and good for testing how a package looks on PyPI and that it can be installed; however, I was more anxious about the package dependencies, knowing the issues I had previously with OpenCV and GDAL in my conda environment. After checking your TestPyPI installation (and this may take a few tries, updating the version number each time), put your dependencies back into your setup.py file, run the distribution file generation again, and test the dependency configuration with the following command, which will attempt to install your package locally:

python setup.py develop

This may take some time to run, but should give you an idea as to whether the dependencies can be resolved. I cloned my base conda environment in order to do this, giving a (relatively) blank environment to run from, and tested the installation by attempting to import the newly installed package in Spyder.

I found that I could not solve the environment no matter what I specified in setup.py, and therefore had to play around to find which package was causing the majority of the problems. GDAL turned out to be the main cause of PyTrx unsuccessfully installing, so I took it out of my dependencies, instead opting to install it afterwards with conda. This seems to work much better, and although it may not be a perfect solution, it creates fewer problems for users.
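
For reference, the local test workflow described above looks roughly like this (the environment name is a hypothetical placeholder):

conda create --name pytrx-test --clone base
conda activate pytrx-test
python setup.py develop
conda install gdal
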

Uploading the distribution to PyPI

So at this point you should feel confident in the look and feel of your package, and its installation in your environment. Before proceeding with the final steps, just run through the following checklist to make sure you have everything:

  • Check that all the information is correct in the setup.py, changing the name (e.g. ‘pytrxhow’ to ‘pytrx’) and restoring dependencies if you have been uploading to TestPyPI previously
  • If you change anything in the setup.py file, then run the distribution file generation again
  • Check your TestPyPI page to make sure all the information uploaded is correct and nothing is missing
  • Check on PyPI that there is no other package with the same name as yours

A thorough check is needed at this stage because an upload to PyPI cannot be changed. Further package versions can be uploaded if there is a major problem, but versions that you have uploaded cannot be edited or altered. Therefore it is best to try and get it right the first time. No pressure!

For uploading to PyPI, you need to create an account on PyPI. This account creation is separate to TestPyPI, so another username and password, unfortunately. Again, create an API token in the same manner as previously with TestPyPI, making sure to write down all of the details associated with it. To upload your package to PyPI, we use Twine again with the following command:

twine upload dist/*

Once run, there will be a link to click through to your PyPI page and voila, your package is online and easy for anyone to download with the old classic command (in the case of PyTrx):

pip install pytrx

In the case of PyTrx, our PyPI page is available to view here, and our GitHub repository contains all of PyTrx’s scripts and the distribution files used for the PyPI upload, which might be useful for some. Hope this helps someone who has suffered any of the pitfalls of PyPI packaging!

Icebergs in Nuuk


Useful resources:

A broad overview and use of Test PyPI

Uploading to PyPI

More information about specifying dependencies and testing package installations

More information about PyPI classifiers

PyTrx’s PyPI page, GitHub repository, and publication

Moving to Greenland

It’s been a while since I used this platform to post an update. A lot has changed in the past year. For one, I now live in Nuuk, the capital city of Greenland. Greenland (or Kalaallit Nunaat in Greenlandic) is a self-governing region within the Kingdom of Denmark, having been granted self-government in 2009. It has the lowest population density in the world*, with just ~17,000 people living in the capital.

Nuuk is a weird and wonderful place to live in. I have already built a strong repertoire of funny anecdotes from trying to navigate my first month of living here – from being asked ‘do you have any meat or fish?’ before getting in a taxi, to watching icebergs float past my house in the fjord. It was a big step to make this move, and so far it’s been absolutely worth it.

I moved here to take up a permanent position at Asiaq Greenland Survey as a remote sensing specialist. It’s a small shift from academia, but we still conduct research and write scientific papers just like everyone else. The pace of work is fast here and I am still settling in and figuring everything out, but it is the refreshing change I needed after finishing my PhD. We are currently working on a handful of ESA projects which I have fallen into nicely, and writing proposals for future projects.

Overall, I hope I can kickstart writing these updates again. They were fun to do during my PhD, and it was a pity I lost that in the lead up to handing in my thesis and during my first postdoc. I hope I can report on projects I work on in Asiaq, and talk about life in Greenland generally. Here’s to this platform’s rejuvenation!

Out and about around Nuuk

This was from a hiking/fishing trip just outside of Nuuk. Fishing here is very easy if you like cod. They say here that if you don’t get any fish after two or three casts then you should move on to the next spot! The fish we caught that day were massive!


*Statistics Greenland have unbelievably thorough analytics on Greenland (e.g. their 2019 report)