Working with H3 data
H3 is a popular icosahedral discrete global grid system (DGGS) with hexagonal cells, developed and popularized by Uber. For more information, see https://h3geo.org. This tutorial shows how to work with H3 data using xdggs.
Import libraries
import xarray as xr
import xdggs
_ = xr.set_options(display_expand_data=False)
Initialization
To initialize, we first have to open the dataset. Here we'll use xarray's air_temperature tutorial dataset, which was interpolated to the H3 grid.
Tip
If the dataset you want to work on is not already on an H3 grid, you will have to use a different package to interpolate.
Warning
For the purpose of this tutorial we drop the geographic coordinates and load all data into memory, but this is not required.
original_ds = xdggs.tutorial.open_dataset("air_temperature", "h3").load()
air_temperature = original_ds.drop_vars(["lat", "lon"])
air_temperature
<xarray.Dataset> Size: 16MB
Dimensions:   (time: 2920, cells: 695)
Coordinates:
  * time      (time) datetime64[ns] 23kB 2013-01-01 ... 2014-12-31T18:00:00
    cell_ids  (cells) uint64 6kB 585508633488392191 ... 587389348127703039
Dimensions without coordinates: cells
Data variables:
    air       (time, cells) float64 16MB 246.3 247.5 237.3 ... 300.3 299.2 295.6
Attributes:
    Conventions:  COARDS
    title:        4x daily NMC reanalysis (1948)
    description:  Data is from NMC initialized reanalysis\n(4x/day). These a...
    platform:     Model
    references:   http://www.esrl.noaa.gov/psd/data/gridded/data.ncep.reanaly...
After that, we can use xdggs.decode() to tell xdggs to interpret the cell ids.
This will create a grid object (see xarray.Dataset.dggs.grid_info and xdggs.H3Info for more information) containing the grid parameters and a custom index for the cell_ids coordinate (notice how the coordinate name is displayed in bold), which will allow us to perform grid-aware operations.
Important
For this to work, the dataset has to have a coordinate called cell_ids, and it also has to have the grid_name and level attributes.
The grid_name refers to the short name of the grid, while level refers to the grid hierarchical level (the h3 libraries call this the "resolution", while xdggs uses "level" for all grids).
In this case, the attributes on cell_ids are:
{
    "grid_name": "h3",
    "level": 2,
}
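As a quick sanity check before decoding, the requirements above can be verified by hand. The sketch below is not part of xdggs: `check_h3_metadata` is a hypothetical helper, and it assumes the cell ids follow the H3 index bit layout documented at https://h3geo.org, where bits 52 to 55 of a cell id store its resolution.

```python
def check_h3_metadata(attrs, cell_ids):
    """Hypothetical helper (not an xdggs API): check the decode requirements by hand."""
    # Both attributes named in the text must be present on `cell_ids`.
    missing = {"grid_name", "level"} - set(attrs)
    if missing:
        raise ValueError(f"missing attributes: {sorted(missing)}")
    if attrs["grid_name"] != "h3":
        raise ValueError(f"expected grid_name 'h3', got {attrs['grid_name']!r}")

    level = attrs["level"]
    # H3 defines resolutions 0 (coarsest) through 15 (finest).
    if not 0 <= level <= 15:
        raise ValueError(f"H3 level must be in 0..15, got {level}")

    # Assumption: standard H3 bit layout, with the resolution in bits 52-55.
    for cell_id in cell_ids:
        stored = (int(cell_id) >> 52) & 0xF
        if stored != level:
            raise ValueError(f"cell {cell_id} has resolution {stored}, not {level}")
    return True


attrs = {"grid_name": "h3", "level": 2}
# First and last cell ids from the dataset shown above.
cell_ids = [585508633488392191, 587389348127703039]
is_consistent = check_h3_metadata(attrs, cell_ids)
```

If the level attribute disagreed with the ids (for example `"level": 3` here), the helper would raise a `ValueError` instead of returning `True`.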
ds = air_temperature.pipe(xdggs.decode)
ds
<xarray.Dataset> Size: 16MB
Dimensions:   (time: 2920, cells: 695)
Coordinates:
  * time      (time) datetime64[ns] 23kB 2013-01-01 ... 2014-12-31T18:00:00
  * cell_ids  (cells) uint64 6kB 585508633488392191 ... 587389348127703039
Dimensions without coordinates: cells
Data variables:
    air       (time, cells) float64 16MB 246.3 247.5 237.3 ... 300.3 299.2 295.6
Indexes:
    cell_ids  H3Index(level=2)
Attributes:
    Conventions:  COARDS
    title:        4x daily NMC reanalysis (1948)
    description:  Data is from NMC initialized reanalysis\n(4x/day). These a...
    platform:     Model
    references:   http://www.esrl.noaa.gov/psd/data/gridded/data.ncep.reanaly...
Deriving data
With the grid object and the custom index, we can derive additional data from the cell ids.
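To illustrate why the cell ids alone carry enough information to derive other quantities, the following sketch unpacks one id by hand. It is not how xdggs works internally; `h3_components` is an illustrative name, and the field positions are taken from the H3 index bit layout documented at https://h3geo.org.

```python
def h3_components(cell_id):
    """Split an H3 cell id into the fields of the documented 64-bit layout."""
    cell_id = int(cell_id)
    resolution = (cell_id >> 52) & 0xF  # bits 52-55
    base_cell = (cell_id >> 45) & 0x7F  # bits 45-51: index of the base cell
    # Fifteen 3-bit digits follow, coarsest first. Only the first
    # `resolution` digits are meaningful; digit i sits at bit 3 * (15 - i).
    digits = [(cell_id >> (3 * (15 - i))) & 0x7 for i in range(1, resolution + 1)]
    return {
        "mode": (cell_id >> 59) & 0xF,  # bits 59-62: 1 marks a cell index
        "resolution": resolution,
        "base_cell": base_cell,
        "digits": digits,
        "hex": format(cell_id, "x"),  # hexadecimal string form of the same id
    }


# First cell id of the dataset shown above.
first = h3_components(585508633488392191)
print(first)
```

For this id the sketch reports resolution 2, which matches the `level` shown by the index, and two digits that locate the cell within its base cell.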
Cell center coordinates
For example, we can reconstruct the cell centers we dropped from the original dataset, using xarray.Dataset.dggs.cell_centers():
cell_centers = ds.dggs.cell_centers()
cell_centers
<xarray.Dataset> Size: 11kB
Dimensions:    (cells: 695)
Coordinates:
    latitude   (cells) float64 6kB 71.56 72.04 72.36 71.65 ... 17.98 17.7 17.52
    longitude  (cells) float64 6kB -78.06 -87.72 -122.4 ... -75.84 -92.88 -95.97
Dimensions without coordinates: cells
Data variables:
    *empty*
These are the same as the ones we dropped before:
derived_ds = ds.assign_coords(
    cell_centers.rename_vars({"latitude": "lat", "longitude": "lon"}).coords
)
derived_ds
<xarray.Dataset> Size: 16MB
Dimensions:   (time: 2920, cells: 695)
Coordinates:
  * time      (time) datetime64[ns] 23kB 2013-01-01 ... 2014-12-31T18:00:00
  * cell_ids  (cells) uint64 6kB 585508633488392191 ... 587389348127703039
    lat       (cells) float64 6kB 71.56 72.04 72.36 71.65 ... 17.98 17.7 17.52
    lon       (cells) float64 6kB -78.06 -87.72 -122.4 ... -75.84 -92.88 -95.97
Dimensions without coordinates: cells
Data variables:
    air       (time, cells) float64 16MB 246.3 247.5 237.3 ... 300.3 299.2 295.6
Indexes:
    cell_ids  H3Index(level=2)
Attributes:
    Conventions:  COARDS
    title:        4x daily NMC reanalysis (1948)
    description:  Data is from NMC initialized reanalysis\n(4x/day). These a...
    platform:     Model
    references:   http://www.esrl.noaa.gov/psd/data/gridded/data.ncep.reanaly...
xr.testing.assert_allclose(derived_ds, original_ds)
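The comparison above passes when values agree within a tolerance, not bit for bit. As a rough sketch of the element-wise criterion, assuming the default `rtol=1e-05` and `atol=1e-08` of `xr.testing.assert_allclose` and the same rule as `numpy.isclose`: `is_close` is an illustrative helper, not xarray's implementation, and the perturbed latitude is a made-up value.

```python
def is_close(actual, desired, rtol=1e-05, atol=1e-08):
    """Element-wise tolerance rule: |actual - desired| <= atol + rtol * |desired|."""
    return abs(actual - desired) <= atol + rtol * abs(desired)


stored_lat = 71.56  # first latitude shown in the output above
recomputed_lat = stored_lat + 1e-9  # hypothetical rounding-level difference

within_tolerance = is_close(recomputed_lat, stored_lat)
clearly_different = is_close(71.57, stored_lat)
print(within_tolerance, clearly_different)
```

A difference at the level of floating-point rounding is accepted, while a shift of a hundredth of a degree is rejected.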
Cell boundary polygons
Additionally, we can derive the cell boundary polygons as an array of Shapely polygons using xarray.Dataset.dggs.cell_boundaries():
cell_boundaries = ds.dggs.cell_boundaries()
cell_boundaries
<xarray.DataArray (cells: 695)> Size: 6kB
POLYGON ((-76.58894477052608 73.22979877705764, -82.3342737067018 72.71199367...
Coordinates:
    cell_ids  (cells) uint64 6kB 585508633488392191 ... 587389348127703039
Dimensions without coordinates: cells
Plotting
We can quickly visualize the data using xarray.DataArray.dggs.explore(), which is powered by lonboard.
Note
The slider requires a running kernel, so this won't work in static documentation.
ds["air"].dggs.explore()