Date created: June 10, 2022
Author: P. Alexander Burnham

Summary

Last time we showed how spacetime can be used to rescale files at the spatial level. We can also use the library to rescale a cube along its time dimension and to select temporal slices from it.

Load file names and create a cube

Let’s load two raster files that contain 101 bands each. Each band is a year. The datasets contain the same data; however, they are at different spatial scales. We will first create the file object.

# read files and load data
data = ["demoData/Carya_ovata_sim_disc_10km.tif", "demoData/Carya_ovata_sim_disc_1km.tif"]
ds = read_data(data)

Next, we will rescale and trim the data to the same grid size. With no arguments specified, the functions will use the first file’s attributes to rescale the rest.

# scale data
scaledData = raster_align(ds)
trimmedData = raster_trim(scaledData)

Finally, we will make a cube object. For demonstration purposes, we will treat each band as a month, giving 101 months. Each file will be assigned to a variable (10km and 1km).

# set up time vec for 101 months starting from 2000-10-12
monthObj = cube_time(start="2000-10-12", length=101, scale = "month")
monthObj
## DatetimeIndex(['2000-10-31', '2000-11-30', '2000-12-31', '2001-01-31',
##                '2001-02-28', '2001-03-31', '2001-04-30', '2001-05-31',
##                '2001-06-30', '2001-07-31',
##                ...
##                '2008-05-31', '2008-06-30', '2008-07-31', '2008-08-31',
##                '2008-09-30', '2008-10-31', '2008-11-30', '2008-12-31',
##                '2009-01-31', '2009-02-28'],
##               dtype='datetime64[ns]', length=101, freq=None)

# make cube
cube = make_cube(data = trimmedData, fileName = "cube1.nc4", organizeFiles = "filestovar", organizeBands="bandstotime", timeObj = monthObj, varNames = ["10km", "1km"])
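
To make the time vector concrete, here is a minimal standard-library sketch of how 101 month-end dates starting from 2000-10-12 can be generated. The helper `month_ends` is hypothetical and is not part of spacetime. It only reproduces the dates printed above, on the assumption (suggested by that output) that `scale = "month"` labels each band with the last day of its month.

```python
import calendar
from datetime import date


def month_ends(start, length):
    """Return `length` month-end dates, beginning with the month containing `start`.

    Hypothetical helper for illustration only; not part of the spacetime API.
    """
    year, month = start.year, start.month
    out = []
    for _ in range(length):
        # monthrange returns (weekday of first day, number of days in month)
        last_day = calendar.monthrange(year, month)[1]
        out.append(date(year, month, last_day))
        # advance one month, rolling over into the next year after December
        month += 1
        if month > 12:
            month = 1
            year += 1
    return out


months = month_ends(date(2000, 10, 12), 101)
print(months[0], months[-1], len(months))  # 2000-10-31 2009-02-28 101
```

The first and last entries match the first and last labels of the `DatetimeIndex` shown above.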

Scale time

This function takes a cube with a time dimension and rescales along that time dimension based on an intended scale and summarizing method. Here we calculate yearly averages by passing “year” to scale and “mean” to method.

# scale time by year
timeScaled = scale_time(cube=cube, scale="year", method="mean")

Let’s examine the data structure after rescaling:

# extract data from cube
timeScaled.get_data_array()
<xarray.DataArray (variables: 2, lat: 149, lon: 297, time: 10)>
array([[[[         nan,          nan,          nan, ...,          nan,
                   nan,          nan],
         [     0.     ,      0.     ,      0.     , ...,      0.     ,
               0.     ,      0.     ],
         [     0.     ,      0.     ,      0.     , ...,      0.     ,
               0.     ,      0.     ],
         ...,
         [         nan,          nan,          nan, ...,          nan,
                   nan,          nan],
         [         nan,          nan,          nan, ...,          nan,
                   nan,          nan],
         [         nan,          nan,          nan, ...,          nan,
                   nan,          nan]],

        [[         nan,          nan,          nan, ...,          nan,
                   nan,          nan],
         [     0.     ,      0.     ,      0.     , ...,      0.     ,
               0.     ,      0.     ],
         [     0.     ,      0.     ,      0.     , ...,      0.     ,
               0.     ,      0.     ],
...
         [         nan,          nan,          nan, ...,          nan,
                   nan,          nan],
         [         nan,          nan,          nan, ...,          nan,
                   nan,          nan],
         [         nan,          nan,          nan, ...,          nan,
                   nan,          nan]],

        [[         nan,          nan,          nan, ...,          nan,
                   nan,          nan],
         [   950.7    ,   1722.4166 ,   1824.5    , ...,   1376.0834 ,
            1292.3334 ,   1195.     ],
         [   259.29257,   1384.8334 ,   1790.4166 , ...,   1123.0834 ,
             922.75   ,    789.5    ],
         ...,
         [         nan,          nan,          nan, ...,          nan,
                   nan,          nan],
         [         nan,          nan,          nan, ...,          nan,
                   nan,          nan],
         [         nan,          nan,          nan, ...,          nan,
                   nan,          nan]]]], dtype=float32)
Coordinates:
  * time       (time) datetime64[ns] 2000-12-31 2001-12-31 ... 2009-12-31
  * variables  (variables) <U4 '10km' '1km'
  * lon        (lon) float32 -83.75 -83.67 -83.58 -83.5 ... -59.25 -59.17 -59.08
  * lat        (lat) float32 49.0 48.92 48.83 48.75 ... 36.92 36.83 36.75 36.67

Now we have 10 time points at the yearly scale (2000 through 2009), where each temporal slice in the cube is the average of the months in that year. The spatial scale has not changed.
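
The grouping behind a yearly mean can be sketched in plain Python. This is an illustration of the idea only, not how `scale_time` is implemented: the pixel values are made up, and the month-end labels are rebuilt by hand to match the time vector used for the cube. It shows why 101 months starting in October 2000 produce the `time: 10` dimension in the output above.

```python
import calendar
from collections import defaultdict
from datetime import date

# Rebuild the 101 month-end labels (October 2000 to February 2009).
labels = []
year, month = 2000, 10
for _ in range(101):
    labels.append(date(year, month, calendar.monthrange(year, month)[1]))
    year, month = (year + 1, 1) if month == 12 else (year, month + 1)

# Made-up values for a single pixel, one per month (illustration only).
values = [float(i) for i in range(101)]

# Group the monthly values by calendar year, then average each group.
by_year = defaultdict(list)
for label, value in zip(labels, values):
    by_year[label.year].append(value)

yearly_mean = {yr: sum(v) / len(v) for yr, v in sorted(by_year.items())}
months_per_year = {yr: len(v) for yr, v in sorted(by_year.items())}

print(len(yearly_mean))   # number of yearly slices
print(months_per_year)    # how many months feed each yearly average
```

Note that the first and last years are partial: only three months fall in 2000 and two in 2009, so those two averages are based on fewer bands than the full years in between.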

Select Time

In addition to some simple rescaling, which will be expanded upon soon, we have the ability to select temporal slices within the cube using the select_time() function. I will add a new time vector to our cube at the daily scale for demonstration purposes.

# a new time vec
monthObjDay = cube_time(start="2000-10-12", length=101, scale = "day")

# make cube
cubeDay = make_cube(data = trimmedData, fileName = "cubeDay.nc4", organizeFiles = "filestovar", organizeBands="bandstotime", timeObj = monthObjDay, varNames = ["10km", "1km"])

Using the select_time() function, we can select a specific day/month/year within a range or for the entire dataset. For example, to choose only the first of each month for the entire cube we would write this:

# the first of the month
selectedDay = select_time(cube=cubeDay, range="entire", scale = "day", element=1)

# print the time vector
selectedDay.get_time()
## DatetimeIndex(['2000-11-01', '2000-12-01', '2001-01-01'], dtype='datetime64[ns]', freq=None)

Now we have a cube with a time dimension of length 3 where each slice is the first of each month in the dataset. We could also select months and do so within a range. Let’s return to our monthly data set and select Aprils between 2000 and 2005.

# select Aprils between 2000 and 2005
selectedApril = select_time(cube=cube, range=["2000", "2005"], scale = "month", element=4)

# print the time vector
selectedApril.get_time()
## DatetimeIndex(['2001-04-30', '2002-04-30', '2003-04-30', '2004-04-30',
##                '2005-04-30'],
##               dtype='datetime64[ns]', freq=None)
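
Both selections above come down to filtering the time vector by one calendar component, optionally restricted to a range of years. The sketch below illustrates that logic with the standard library. The function `select_dates` and its `year_range` parameter are hypothetical names, not part of spacetime, and the inclusive upper bound is an assumption based on 2005 appearing in the output above.

```python
import calendar
from datetime import date, timedelta


def select_dates(dates, scale, element, year_range=None):
    """Keep dates whose day, month, or year equals `element`.

    Hypothetical helper for illustration only; not part of the spacetime API.
    `year_range` is an optional (first_year, last_year) pair, inclusive.
    """
    picked = []
    for d in dates:
        if year_range is not None and not (year_range[0] <= d.year <= year_range[1]):
            continue
        part = {"day": d.day, "month": d.month, "year": d.year}[scale]
        if part == element:
            picked.append(d)
    return picked


# 101 consecutive days starting on 2000-10-12, like the daily time vector.
days = [date(2000, 10, 12) + timedelta(days=i) for i in range(101)]
firsts = select_dates(days, scale="day", element=1)

# 101 month-end dates starting in October 2000, like the monthly time vector.
months = []
year, month = 2000, 10
for _ in range(101):
    months.append(date(year, month, calendar.monthrange(year, month)[1]))
    year, month = (year + 1, 1) if month == 12 else (year, month + 1)
aprils = select_dates(months, scale="month", element=4, year_range=(2000, 2005))

print(firsts)
print(aprils)
```

No April 2000 appears in the result because the monthly vector only begins in October 2000, which is consistent with the five dates returned by `selectedApril.get_time()`.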