Glacier working directories

See also: GlacierDirectory

The majority of OGGM tasks are so-called “entity tasks”: standalone operations performed on a single glacier entity. These tasks are executed sequentially, and each often needs input generated by the previous task(s). To avoid complicated chains of arguments, each task reads its input data from a glacier-specific directory and writes its output into the same directory, making the new data available for further computations.
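
The read-then-write pattern described above can be illustrated without OGGM itself. The following is a minimal sketch using only the standard library; the task names `write_area` and `compute_area_m2`, the file names and the JSON format are all hypothetical and are not part of OGGM:

```python
import json
import os
import tempfile


def write_area(workdir, area_km2):
    """Hypothetical first task: store a value in the working directory."""
    with open(os.path.join(workdir, 'area.json'), 'w') as f:
        json.dump({'area_km2': area_km2}, f)


def compute_area_m2(workdir):
    """Hypothetical second task: read the previous output, write a new file."""
    with open(os.path.join(workdir, 'area.json')) as f:
        area_km2 = json.load(f)['area_km2']
    with open(os.path.join(workdir, 'area_m2.json'), 'w') as f:
        json.dump({'area_m2': area_km2 * 1e6}, f)


# Both tasks only take the directory as argument: the data travels on disk
workdir = tempfile.mkdtemp()
write_area(workdir, 6.25)
compute_area_m2(workdir)
files = sorted(os.listdir(workdir))
```

Neither task passes a return value to the other; the second one only needs to know which directory to look into.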

Initialising a glacier directory

If no directory has been created yet, a GlacierDirectory requires an RGI entity as input:

In [1]: base_dir = os.path.join(os.path.expanduser('~'), 'OGGM_docs', 'GlacierDir')

In [2]: entity = gpd.GeoDataFrame.from_file(get_demo_file('HEF_MajDivide.shp')).iloc[0]

In [3]: gdir = oggm.GlacierDirectory(entity, base_dir=base_dir)

In [4]: gdir.dir
Out[4]: '/home/docs/OGGM_docs/GlacierDir/RGI50-11/RGI50-11.00/RGI50-11.00897'

In [5]: gdir.rgi_id, gdir.rgi_area_km2
Out[5]: ('RGI50-11.00897', 6.247362925353041)

Note that this directory has just been created and is empty. The tasks.define_glacier_region() task will fill it with the first data files:

In [6]: tasks.define_glacier_region(gdir, entity=entity)

In [7]: os.listdir(gdir.dir)
Out[7]: 
['intersects.cpg',
 'outlines.dbf',
 'outlines.cpg',
 'intersects.shx',
 'dem_source.pkl',
 'intersects.shp',
 'outlines.shx',
 'intersects.prj',
 'outlines.shp',
 'log.txt',
 'dem.tif',
 'glacier_grid.json',
 'intersects.dbf',
 'outlines.prj']

This persistence on disk makes it possible, for example, to continue a workflow that was previously interrupted. Initialising a GlacierDirectory from a non-empty folder won’t erase its content (you’ll have to set reset=True explicitly if you want that):

In [8]: gdir = oggm.GlacierDirectory(entity, base_dir=base_dir)

In [9]: os.listdir(gdir.dir)  # the directory still contains the data
Out[9]: 
['intersects.cpg',
 'outlines.dbf',
 'outlines.cpg',
 'intersects.shx',
 'dem_source.pkl',
 'intersects.shp',
 'outlines.shx',
 'intersects.prj',
 'outlines.shp',
 'log.txt',
 'dem.tif',
 'glacier_grid.json',
 'intersects.dbf',
 'outlines.prj']
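
The keep-by-default, wipe-on-request behaviour can be pictured with a small stand-alone sketch. The helper `open_workdir` below is hypothetical and only mimics the semantics described above; it is not how GlacierDirectory is implemented:

```python
import os
import shutil
import tempfile


def open_workdir(path, reset=False):
    """Hypothetical helper: create `path` if needed, wipe it only if reset=True."""
    if reset and os.path.exists(path):
        shutil.rmtree(path)
    os.makedirs(path, exist_ok=True)
    return path


root = tempfile.mkdtemp()
path = open_workdir(os.path.join(root, 'glacier'))
open(os.path.join(path, 'dem.tif'), 'w').close()  # stand-in for a task output

kept = os.listdir(open_workdir(path))               # default: content is kept
wiped = os.listdir(open_workdir(path, reset=True))  # explicit reset: empty again
```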

You can also initialise a non-empty GlacierDirectory from its RGI ID, which avoids reading the shapefile every time:

In [10]: gdir = oggm.GlacierDirectory('RGI50-11.00897', base_dir=base_dir)
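
Judging from the path returned by gdir.dir earlier, the directory location appears to depend only on base_dir and the RGI ID, which would explain why the ID alone is enough to find an existing directory. The sketch below reproduces that layout; the helper name `glacier_dir_path` is hypothetical, and the slicing rule is inferred from the single example path in this section, not taken from the OGGM source:

```python
import os


def glacier_dir_path(base_dir, rgi_id):
    """Hypothetical helper reproducing the layout seen in the gdir.dir output."""
    region = rgi_id[:8]   # e.g. 'RGI50-11' (assumed rule)
    group = rgi_id[:11]   # e.g. 'RGI50-11.00' (assumed rule)
    return os.path.join(base_dir, region, group, rgi_id)


path = glacier_dir_path('/home/docs/OGGM_docs/GlacierDir', 'RGI50-11.00897')
```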

cfg.BASENAMES

This is a list of the files that can be found in the glacier directory or its divides. These data files and their names are standardized and listed in the oggm.cfg module. If you want to implement your own task you’ll have to add an entry to this file too.

calving_output.pkl
Calving output
catchment_indices.pkl
A list of length n_centerlines, each element containing a numpy array of the indices in the glacier grid which represent the centerline’s catchment area.
catchments_intersects.shp
The catchments intersections in the local projection.
centerlines.pkl
A list of Centerline instances, sorted by flow order.
cesm_data.nc
The monthly GCM climate timeseries for this glacier, stored in a netCDF file.
climate_info.pkl
Some information (a dictionary) about the climate data for this glacier, which avoids many unnecessary accesses to the netCDF file.
climate_monthly.nc
The monthly climate timeseries for this glacier, stored in a netCDF file.
dem.tif
A geotiff file containing the DEM (reprojected into the local grid).
dem_source.pkl
A string with the source of the topo file (ASTER, SRTM, …).
downstream_line.pkl
A dict containing the downstream line geometry as well as the bed shape computed from a parabolic fit.
flowline_catchments.shp
The flowline catchments in the local projection.
geometries.pkl
A dict containing the shapely.Polygons of a glacier. The “polygon_hr” entry contains the geometry transformed to the local grid in (i, j) coordinates, while the “polygon_pix” entry contains the geometries transformed into the coarse grid (the i, j elements are integers). The “polygon_area” entry contains the area of the polygon as computed by Shapely.
glacier_grid.json
A salem.Grid handling the georeferencing of the local grid.
gridded_data.nc
A netCDF file containing several gridded data variables such as topography, the glacier masks and more (see the netCDF file metadata).
intersects.shp
The glacier intersects in the local projection.
inversion_flowlines.pkl
A “better” version of the Centerlines, now with regular spacing, i.e. not on the gridded (i, j) indices. The tails of the tributaries are cut out to make more realistic junctions. They are now “1.5D”, i.e. they have a width.
inversion_input.pkl
List of dicts containing the data needed for the inversion.
inversion_output.pkl
List of dicts containing the output data from the inversion.
inversion_params.pkl
Dict of fs and fd as computed by the inversion optimisation.
linear_mb_params.pkl
When using a linear mass-balance for the inversion, this dict stores the optimal ela_h and grad.
local_mustar.csv
A CSV file with three values: the local scalars mu*, t* and bias.
model_diagnostics.nc
A netCDF file containing the model diagnostics (volume, mass-balance, length…).
model_flowlines.pkl
List of flowlines ready to be run by the model.
model_run.nc
A netCDF file containing enough information to reconstruct the entire flowline glacier along the run (can be data expensive).
mu_candidates.pkl
A pandas.Series with the (year, mu) data.
outlines.shp
The glacier outlines in the local projection.
prcp_fac_optim.pkl
A DataFrame containing the bias scores as a function of the precipitation factor. This is mostly useful for testing.
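
The idea of a central list of standardized file names, to which a custom task adds its own entry, can be sketched as a small registry. Everything in the block below is hypothetical: the `REGISTRY` dict, the `resolve_path` helper, the keys and the `my_output.pkl` entry are invented for illustration, and the actual structure of cfg.BASENAMES may differ:

```python
import os

# Hypothetical registry: key -> (file name, description).
# The first two file names are taken from the list above.
REGISTRY = {
    'dem': ('dem.tif', 'A geotiff file containing the DEM.'),
    'outlines': ('outlines.shp', 'The glacier outlines in the local projection.'),
}

# A custom task registers its output file before using it (invented entry)
REGISTRY['my_output'] = ('my_output.pkl', 'Output of a custom task.')


def resolve_path(workdir, key):
    """Hypothetical lookup: resolve a registered key to a path in workdir."""
    if key not in REGISTRY:
        raise KeyError('{} is not a registered file name'.format(key))
    return os.path.join(workdir, REGISTRY[key][0])


path = resolve_path('/tmp/glacier', 'my_output')
```

Keeping the names in one place means that tasks refer to files by key, and a typo in a key fails early instead of silently creating a stray file.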