
from xgriddedaxis import Remapper
from xgriddedaxis.testing import create_dataset
import xarray as xr
Input Data

For demonstration purposes, we use the create_dataset() function to generate test data.

ds = create_dataset(start='2000-01-01', end='2002-01-01', freq='D', nlats=90, nlons=180)
Our input dataset consists of two variables, tmin and tmax, plus the time_bounds variable. The data were generated at a daily frequency for two years.
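
To make the shape of the incoming time axis concrete, here is a minimal standard-library sketch of a daily axis with bounds over the same period. It is an illustration only: the names start, end, n_days and time_bounds are hypothetical, the half-open [lower, upper) convention is an assumption, and this is not how create_dataset() builds its data internally.

```python
from datetime import date, timedelta

# Hypothetical stand-in for the time axis described above.
start, end = date(2000, 1, 1), date(2002, 1, 1)
n_days = (end - start).days

# Assumption: each timestep is a [lower, upper) interval covering one day.
time_bounds = [
    (start + timedelta(days=i), start + timedelta(days=i + 1))
    for i in range(n_days)
]

print(n_days)           # 731 (2000 is a leap year: 366 + 365)
print(time_bounds[0])   # first daily interval
print(time_bounds[-1])  # last daily interval
```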

m = ds.mean(dim=['lat', 'lon'])
Remapper Object

Say we want to downsample the input data from daily to monthly frequency. To achieve this, we create a remapper object, and pass in:

  • An xarray Dataset containing the time and time-boundary information of the incoming time axis.
  • An outgoing frequency, e.g. 'M', '2D', 'H', or '3T'. For the full specification of available frequencies, please see here.
remapper = Remapper(ds, freq='M')
When the Remapper object is created, xgriddedaxis combines the incoming time axis information with the specified frequency to construct the outgoing time axis information. This information is stored as an xarray Dataset in the .info attribute of the remapper object:

The remapper is telling us that it can remap data from a daily time frequency with 731 incoming timesteps (731 days) to a monthly time frequency with 24 outgoing timesteps (24 months).
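
The count of 24 outgoing timesteps can be reproduced with a small standard-library sketch that cuts the span of the incoming axis at month starts. The helper month_edges is hypothetical and only illustrates the idea. How xgriddedaxis actually builds and labels its outgoing bounds is not shown in this notebook.

```python
from datetime import date

def month_edges(start, end):
    """Return month-start dates from start's month through end (inclusive)."""
    edges = []
    year, month = start.year, start.month
    current = date(year, month, 1)
    while current <= end:
        edges.append(current)
        month += 1
        if month > 12:
            year, month = year + 1, 1
        current = date(year, month, 1)
    return edges

# Span of the incoming axis: first lower bound to last upper bound.
edges = month_edges(date(2000, 1, 1), date(2002, 1, 1))

# Consecutive edges form the outgoing [lower, upper) intervals.
outgoing_bounds = list(zip(edges[:-1], edges[1:]))

print(len(outgoing_bounds))  # 24
```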

The remapping weights are stored as a sparse matrix (following the Coordinate List (COO) layout) in the weights variable:
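
As a rough illustration of what such a weight matrix can contain, the sketch below builds COO triplets (row, column, value) from the overlap between incoming and outgoing intervals. It assumes each weight is the fraction of an outgoing interval covered by an incoming interval. That is a common choice for time averaging, but it is an assumption here, and overlap_weights_coo is a hypothetical helper, not part of xgriddedaxis.

```python
from datetime import date, timedelta

def overlap_weights_coo(incoming, outgoing):
    """Build COO triplets (rows, cols, data) of fractional overlap weights.

    rows index the outgoing interval, cols the incoming interval, and data
    holds the share of the outgoing interval covered by the incoming one.
    """
    rows, cols, data = [], [], []
    for i, (out_lo, out_hi) in enumerate(outgoing):
        out_len = (out_hi - out_lo).days
        for j, (in_lo, in_hi) in enumerate(incoming):
            overlap = (min(out_hi, in_hi) - max(out_lo, in_lo)).days
            if overlap > 0:  # store non-zero entries only
                rows.append(i)
                cols.append(j)
                data.append(overlap / out_len)
    return rows, cols, data

# Toy axes: 60 daily intervals (January and February 2000), 2 monthly intervals.
first = date(2000, 1, 1)
incoming = [(first + timedelta(days=k), first + timedelta(days=k + 1))
            for k in range(60)]
outgoing = [(date(2000, 1, 1), date(2000, 2, 1)),
            (date(2000, 2, 1), date(2000, 3, 1))]

rows, cols, data = overlap_weights_coo(incoming, outgoing)

# A dense 2 x 60 matrix would hold 120 values; COO keeps only the 60 non-zeros.
print(len(data))
```

Each row of the implied matrix sums to 1, so applying it to a time series yields an average rather than a sum.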

The outgoing time bounds are stored in the outgoing_time_bounds variable:

More information about the incoming and outgoing time axes is stored in the attrs section:

Performing remapping (resampling)

Now that we have an instance of the Remapper object, we can tell xgriddedaxis to convert data from the incoming time axis to the outgoing (destination) axis.

tmin_out = remapper.average(ds.tmin)
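
Conceptually, an averaging step like this amounts to a sparse matrix-vector product between the weights and the incoming values. The sketch below shows that idea on a toy series using COO triplets. The function apply_coo_average and the toy numbers are hypothetical, and this is not the implementation of remapper.average().

```python
def apply_coo_average(rows, cols, data, values, n_out):
    """Compute out[i] = sum over j of W[i, j] * values[j] from COO triplets."""
    out = [0.0] * n_out
    for i, j, w in zip(rows, cols, data):
        out[i] += w * values[j]
    return out

# Toy example: 4 incoming steps averaged pairwise into 2 outgoing steps.
rows = [0, 0, 1, 1]
cols = [0, 1, 2, 3]
data = [0.5, 0.5, 0.5, 0.5]
values = [1.0, 3.0, 10.0, 20.0]

result = apply_coo_average(rows, cols, data, values, n_out=2)
print(result)  # [2.0, 15.0]
```
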
Check results

Check broadcasting over extra dimensions

The remapping should affect the time dimension only. We can check that xgriddedaxis preserves the coordinate values of the other dimensions:

# Passes if the output is exactly the same as the input
xr.testing.assert_identical(ds.lat, tmin_out.lat)
xr.testing.assert_identical(ds.lon, tmin_out.lon)
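
Why the spatial coordinates should come through unchanged can be seen in a small pure-Python sketch: the weights are applied along the time axis only, independently for every spatial point. The names average_over_time, lat, field and weights are hypothetical, and the dense weight matrix is used only for readability.

```python
def average_over_time(weights, field):
    """Apply a dense (n_out, n_in) weight matrix along the time axis only.

    field is a list of time slices; each slice is a list with one value per
    spatial point. The spatial layout of each slice is left untouched.
    """
    n_points = len(field[0])
    return [
        [sum(w * field[t][p] for t, w in enumerate(row)) for p in range(n_points)]
        for row in weights
    ]

lat = [-45.0, 0.0, 45.0]  # hypothetical spatial coordinate
field = [                  # 4 timesteps x 3 spatial points
    [1.0, 2.0, 3.0],
    [3.0, 4.0, 5.0],
    [5.0, 6.0, 7.0],
    [7.0, 8.0, 9.0],
]
weights = [[0.5, 0.5, 0.0, 0.0],
           [0.0, 0.0, 0.5, 0.5]]

out = average_over_time(weights, field)

# Time shrinks from 4 steps to 2; the number of spatial points is unchanged.
print(len(out), len(out[0]))  # 2 3
```
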
We can plot the time series at a specific location, to make sure the broadcasting is correct:

ds.tmin.sel(lat=-90, lon=-180).plot()
tmin_out.sel(lat=-90, lon=-180).plot(color='red')
%load_ext watermark
%watermark -v -m -g -p xarray,xgriddedaxis,cftime,pandas
CPython 3.7.3
IPython 7.13.0

xarray 0.15.1
xgriddedaxis 0.0.post43
cftime 1.1.2
pandas 1.0.3

compiler   : GCC 7.4.0
system     : Linux
release    : 5.0.0-1032-azure
machine    : x86_64
processor  : x86_64
CPU cores  : 1
interpreter: 64bit
Git hash   : b4e6bd9215269111a1dd602831c9d6ecec6f768d