Conventions#
Different communities and file formats have different ways of encoding the metadata for discrete global grid systems.
The built-in format (called the "xdggs" convention) puts all metadata on the coordinate containing the cell indices. When decoding, this metadata is used to construct an in-memory xarray index and is then removed from the coordinate:
import xdggs
import xarray as xr
xdggs_encoded = xdggs.tutorial.open_dataset("air_temperature", "h3").load()
display(xdggs_encoded)
decoded = xdggs_encoded.dggs.decode()
display(decoded)
<xarray.Dataset> Size: 16MB
Dimensions: (time: 2920, cells: 695)
Coordinates:
* time (time) datetime64[ns] 23kB 2013-01-01 ... 2014-12-31T18:00:00
lon (cells) float64 6kB -78.06 -87.72 -122.4 ... -75.84 -92.88 -95.97
lat (cells) float64 6kB 71.56 72.04 72.36 71.65 ... 17.98 17.7 17.52
cell_ids (cells) uint64 6kB 585508633488392191 ... 587389348127703039
Dimensions without coordinates: cells
Data variables:
air (time, cells) float64 16MB 246.3 247.5 237.3 ... 300.3 299.2 295.6
Attributes:
Conventions: COARDS
title: 4x daily NMC reanalysis (1948)
description: Data is from NMC initialized reanalysis\n(4x/day). These a...
platform: Model
references: http://www.esrl.noaa.gov/psd/data/gridded/data.ncep.reanaly...
<xarray.Dataset> Size: 16MB
Dimensions: (time: 2920, cells: 695)
Coordinates:
* time (time) datetime64[ns] 23kB 2013-01-01 ... 2014-12-31T18:00:00
lon (cells) float64 6kB -78.06 -87.72 -122.4 ... -75.84 -92.88 -95.97
lat (cells) float64 6kB 71.56 72.04 72.36 71.65 ... 17.98 17.7 17.52
* cell_ids (cells) uint64 6kB 585508633488392191 ... 587389348127703039
Dimensions without coordinates: cells
Data variables:
air (time, cells) float64 16MB 246.3 247.5 237.3 ... 300.3 299.2 295.6
Indexes:
cell_ids H3Index(level=2)
Attributes:
Conventions: COARDS
title: 4x daily NMC reanalysis (1948)
description: Data is from NMC initialized reanalysis\n(4x/day). These a...
platform: Model
references: http://www.esrl.noaa.gov/psd/data/gridded/data.ncep.reanaly...
Built-in conventions#
xdggs comes with support for a few external conventions:
"cf": the convention for HEALPix added in version 1.13 of the CF conventions
"zarr": the zarr dggs convention
CF convention#
The included convention object supports a generalized version of the HEALPix grid mapping added in version 1.13 of the CF conventions. Because it is generalized, it applies the same encoding scheme to other DGGS such as H3.
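As a rough illustration of what a CF-style grid mapping looks like, the sketch below models a dataset with plain dictionaries: the grid parameters live in the attributes of a scalar variable, and each data variable points at that variable by name through its `grid_mapping` attribute. The `grid_mapping` and `grid_mapping_name` attributes come from the CF conventions, and the variable name `crs` mirrors the scalar coordinate in the CF-encoded output on this page. The value `"h3"` and the `refinement_level` key are assumptions made for illustration, not a statement of what xdggs actually writes.

```python
# Plain-dict model of a CF-style grid mapping (illustration only, not xdggs code).
dataset = {
    # Scalar variable: it carries no data of interest, only grid parameters.
    "crs": {
        "dims": (),
        "attrs": {
            "grid_mapping_name": "h3",  # assumed value
            "refinement_level": 2,      # assumed attribute name
        },
    },
    # Data variable: refers to the scalar variable by name.
    "air": {"dims": ("time", "cells"), "attrs": {"grid_mapping": "crs"}},
}


def grid_mapping_attrs(ds, var_name):
    """Follow a variable's `grid_mapping` attribute to the grid parameters."""
    target = ds[var_name]["attrs"].get("grid_mapping")
    if target is None:
        return None  # this variable does not reference a grid mapping
    return ds[target]["attrs"]


print(grid_mapping_attrs(dataset, "air"))
```

In xdggs, this kind of bookkeeping is the job of the CF convention object.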
To use it, pass convention="cf" to the decode / encode methods:
cf_encoded = decoded.dggs.encode(convention="cf")
display(cf_encoded)
cf_decoded = cf_encoded.dggs.decode(convention="cf")
xr.testing.assert_identical(cf_decoded, decoded)
<xarray.Dataset> Size: 16MB
Dimensions: (time: 2920, cells: 695)
Coordinates:
* time (time) datetime64[ns] 23kB 2013-01-01 ... 2014-12-31T18:00:00
lon (cells) float64 6kB -78.06 -87.72 -122.4 ... -75.84 -92.88 -95.97
lat (cells) float64 6kB 71.56 72.04 72.36 71.65 ... 17.98 17.7 17.52
cell_ids (cells) uint64 6kB 585508633488392191 ... 587389348127703039
crs int8 1B 0
Dimensions without coordinates: cells
Data variables:
air (time, cells) float64 16MB 246.3 247.5 237.3 ... 300.3 299.2 295.6
Attributes:
Conventions: COARDS
title: 4x daily NMC reanalysis (1948)
description: Data is from NMC initialized reanalysis\n(4x/day). These a...
platform: Model
references: http://www.esrl.noaa.gov/psd/data/gridded/data.ncep.reanaly...
zarr convention#
The zarr dggs convention makes use of the nesting allowed by zarr to encode all grid metadata into a single metadata object in the attributes.
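To make the difference in layout concrete, here is a minimal sketch using plain dictionaries and the standard `json` module. It contrasts flat metadata attached to the cell id coordinate with a single nested `dggs` object in the dataset attributes. The `name` and `refinement_level` keys appear in the zarr-encoded output on this page; the flat attribute names `grid_name` and `level` are assumptions, and the real nested object contains further keys that are not reproduced here.

```python
import json

# Flat layout: metadata sits directly in the attributes of the cell id coordinate.
flat_coord_attrs = {"grid_name": "h3", "level": 2}  # assumed attribute names


def to_nested(coord_attrs):
    """Collect flat coordinate attributes into one nested `dggs` object."""
    return {
        "dggs": {
            "name": coord_attrs["grid_name"],
            "refinement_level": coord_attrs["level"],
        }
    }


def to_flat(dataset_attrs):
    """Inverse of `to_nested`: unpack the nested object again."""
    dggs = dataset_attrs["dggs"]
    return {"grid_name": dggs["name"], "level": dggs["refinement_level"]}


nested_attrs = to_nested(flat_coord_attrs)
# Nested objects serialize directly to JSON, the format zarr uses for attributes.
print(json.dumps(nested_attrs))
```

The round trip `to_flat(to_nested(...))` returns the original attributes, which is the property the encode / decode pair of a convention is expected to have. The zarr convention in xdggs handles this translation itself.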
To use it, pass convention="zarr" to the decode / encode methods:
zarr_encoded = decoded.dggs.encode(convention="zarr")
display(zarr_encoded)
zarr_decoded = zarr_encoded.dggs.decode(convention="zarr")
xr.testing.assert_identical(zarr_decoded, decoded)
<xarray.Dataset> Size: 16MB
Dimensions: (time: 2920, cells: 695)
Coordinates:
* time (time) datetime64[ns] 23kB 2013-01-01 ... 2014-12-31T18:00:00
lon (cells) float64 6kB -78.06 -87.72 -122.4 ... -75.84 -92.88 -95.97
lat (cells) float64 6kB 71.56 72.04 72.36 71.65 ... 17.98 17.7 17.52
cell_ids (cells) uint64 6kB 585508633488392191 ... 587389348127703039
Dimensions without coordinates: cells
Data variables:
air (time, cells) float64 16MB 246.3 247.5 237.3 ... 300.3 299.2 295.6
Attributes:
Conventions: COARDS
title: 4x daily NMC reanalysis (1948)
description: Data is from NMC initialized reanalysis\n(4x/day). Th...
platform: Model
references: http://www.esrl.noaa.gov/psd/data/gridded/data.ncep.re...
dggs: {'name': 'h3', 'refinement_level': 2, 'spatial_dimensi...
zarr_conventions: [{'uuid': '7b255807-140c-42ca-97f6-7a1cfecdbc38', 'sch...
Registering a custom convention#
A convention is defined as a class inheriting from Convention. It must define two methods:
decode() for decoding into the in-memory structure
encode() for encoding the in-memory format to the given convention
For example:
import xdggs
import xarray as xr
from collections.abc import Hashable
from typing import Any
from xdggs.grid import DGGSInfo


@xdggs.conventions.register_convention("my-convention")
class MyConvention(xdggs.conventions.Convention):
    def decode(
        self,
        obj: xr.Dataset,
        *,
        grid_info: dict[str, Any] | DGGSInfo | None,
        name: Hashable | None,
        index_options: dict[str, Any] | None,
    ) -> xr.Dataset:
        # decode: translate this convention's metadata into the in-memory structure
        pass

    def encode(
        self, obj: xr.Dataset, *, encoding: dict[str, Any] | None = None
    ) -> xr.Dataset:
        # encode: write the in-memory format out using this convention
        pass
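How a decorator such as `register_convention` can tie a name to a class may be easier to see in isolation. The following standalone sketch uses only the standard library. Everything in it is hypothetical (`_REGISTRY`, `get_convention`, the simplified `Convention` base class, and the toy `UppercaseKeys` class), and it is not meant to describe how xdggs implements its registry.

```python
from abc import ABC, abstractmethod

_REGISTRY = {}  # hypothetical mapping: convention name -> convention instance


class Convention(ABC):
    """Simplified stand-in for a convention base class."""

    @abstractmethod
    def decode(self, obj): ...

    @abstractmethod
    def encode(self, obj): ...


def register_convention(name):
    """Class decorator that stores an instance of the class under `name`."""

    def decorator(cls):
        _REGISTRY[name] = cls()
        return cls

    return decorator


def get_convention(name):
    """Look up a registered convention, failing loudly on unknown names."""
    try:
        return _REGISTRY[name]
    except KeyError:
        raise ValueError(f"unknown convention: {name!r}") from None


@register_convention("upper")
class UppercaseKeys(Convention):
    """Toy convention: the encoded form uses upper-case keys."""

    def decode(self, obj):
        return {key.lower(): value for key, value in obj.items()}

    def encode(self, obj):
        return {key.upper(): value for key, value in obj.items()}


encoded = get_convention("upper").encode({"level": 2})
decoded = get_convention("upper").decode(encoded)
print(encoded, decoded)
```

Because the abstract base class declares both methods, a subclass that forgets `decode()` or `encode()` fails at registration time, when the decorator instantiates it, rather than at first use.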