Tutorial¶
[1]:
from xgriddedaxis import Remapper
from xgriddedaxis.testing import create_dataset
import xarray as xr
xr.set_options(display_style="html")
---------------------------------------------------------------------------
ModuleNotFoundError Traceback (most recent call last)
<ipython-input-1-d8ee320b5cf4> in <module>
1 from xgriddedaxis import Remapper
----> 2 from xgriddedaxis.testing import create_dataset
3 import xarray as xr
4 xr.set_options(display_style="html")
ModuleNotFoundError: No module named 'xgriddedaxis.testing'
Input Data¶
For demonstration purposes, we are going to use the create_dataset()
function for generating test data.
[2]:
ds = create_dataset(start='2000-01-01', end='2002-01-01', freq='D', nlats=90, nlons=180)
ds
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-2-07ad2c7318d6> in <module>
----> 1 ds = create_dataset(start='2000-01-01', end='2002-01-01', freq='D', nlats=90, nlons=180)
2 ds
NameError: name 'create_dataset' is not defined
Our input data set consists of two variables tmin
, and tmax
plus the time_bounds
variable. The data was generated at a daily frequency for two years.
[3]:
ds.tmin.isel(time=0).plot(robust=True)
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-3-21f5d7dc9c47> in <module>
----> 1 ds.tmin.isel(time=0).plot(robust=True)
NameError: name 'ds' is not defined
[4]:
m = ds.mean(dim=['lat', 'lon'])
m.tmin.plot()
m.tmax.plot()
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-4-ef7652c33a7a> in <module>
----> 1 m = ds.mean(dim=['lat', 'lon'])
2 m.tmin.plot()
3 m.tmax.plot()
NameError: name 'ds' is not defined
Remapper Object¶
Say we want to downsample the input data from daily
to monthly
frequency. To achieve this, we create a remapper object, and pass in:
- An xarray Dataset containing the time, time boundary information of the incoming time axis.
- An outgoing frequency. For e.g ‘M’, ‘2D’, ‘H’, or ‘3T’ For full specification of available frequencies, please see here
[5]:
remapper = Remapper(ds, freq='M')
remapper
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-5-dd8fccf6b83e> in <module>
----> 1 remapper = Remapper(ds, freq='M')
2 remapper
NameError: name 'ds' is not defined
During the Remapper
object creation, xgriddedaxis
uses the incoming time axis information in conjunction with the specified frequency to construct an outgoing time axis information. This information is stored as an xarray Dataset in the .info
attribute of the remapper object:
[6]:
remapper.info
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-6-9abd55daf06e> in <module>
----> 1 remapper.info
NameError: name 'remapper' is not defined
The remapper is telling us that it can remap data from a daily time frequency with 731 incoming timesteps (731 days)
to monthly time frequency with 24 outgoing timesteps (24 months)
.
The remapping weights are stored as a sparse matrix (following the Coordinate List (COO) layout) in the weights
variable:
[7]:
remapper.info.weights.data
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-7-b2872189b48a> in <module>
----> 1 remapper.info.weights.data
NameError: name 'remapper' is not defined
[8]:
remapper.info.weights.data.todense()
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-8-609d9ea0fe7a> in <module>
----> 1 remapper.info.weights.data.todense()
NameError: name 'remapper' is not defined
The outgoing time bounds are stored in the outgoing_time_bounds
variable:
[9]:
remapper.info.outgoing_time_bounds
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-9-a25bd9ad4eb6> in <module>
----> 1 remapper.info.outgoing_time_bounds
NameError: name 'remapper' is not defined
More information about the incoming and outgoing time axes is stored in the attrs
section:
[10]:
remapper.info.attrs
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-10-e5503ddef281> in <module>
----> 1 remapper.info.attrs
NameError: name 'remapper' is not defined
Performing remapping (resampling)¶
Now that we have an instance of the Remapper
object, we can tell xgriddedaxis
to convert data from the incoming time axis to the outgoing (destination) axis.
[11]:
tmin_out = remapper.average(ds.tmin)
tmin_out
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-11-d9de4c008527> in <module>
----> 1 tmin_out = remapper.average(ds.tmin)
2 tmin_out
NameError: name 'remapper' is not defined
Check results¶
[12]:
tmin_out.isel(time=0).plot(robust=True)
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-12-6ba65521348e> in <module>
----> 1 tmin_out.isel(time=0).plot(robust=True)
NameError: name 'tmin_out' is not defined
Check broadcasting over extra dimensions¶
The remapping should affect the time dimension only. We can check that xgriddedaxis
tracks coordinate values over extra dimensions
[13]:
ds.lat
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-13-21970646b2de> in <module>
----> 1 ds.lat
NameError: name 'ds' is not defined
[14]:
# Passes if the output is exactly the same as the input
xr.testing.assert_identical(ds.lat, tmin_out.lat)
xr.testing.assert_identical(ds.lon, tmin_out.lon)
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-14-22e9fd95de44> in <module>
1 # Passes if the output is exactly the same as the input
----> 2 xr.testing.assert_identical(ds.lat, tmin_out.lat)
3 xr.testing.assert_identical(ds.lon, tmin_out.lon)
NameError: name 'xr' is not defined
We can plot the time series at a specific location, to make sure the broadcasting is correct:
[15]:
ds.tmin.sel(lat=-90, lon=-180).plot()
tmin_out.sel(lat=-90, lon=-180).plot(color='red')
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-15-ebffd2cf2e5d> in <module>
----> 1 ds.tmin.sel(lat=-90, lon=-180).plot()
2 tmin_out.sel(lat=-90, lon=-180).plot(color='red')
NameError: name 'ds' is not defined
[16]:
%load_ext watermark
%watermark -v -m -g -p xarray,xgriddedaxis,cftime,pandas
CPython 3.7.3
IPython 7.13.0
xarray 0.15.1
xgriddedaxis 0.0.post43
cftime 1.1.2
pandas 1.0.3
compiler : GCC 7.4.0
system : Linux
release : 5.0.0-1032-azure
machine : x86_64
processor : x86_64
CPU cores : 1
interpreter: 64bit
Git hash : b4e6bd9215269111a1dd602831c9d6ecec6f768d