Compare time axes of multiple netCDF files

Let’s compute the maximum intersection of the time axes of several test netCDF files.

[1]:
import pyhomogenize as pyh
[2]:
file1 = pyh.test_netcdf[0]
file2 = pyh.test_netcdf[1]
file1.split("/")[-1], file2.split("/")[-1]
[2]:
('tas_EUR-11_MIROC-MIROC5_rcp85_r1i1p1_CLMcom-CCLM4-8-17_v1_mon_200701-200707.nc',
 'tas_EUR-11_MIROC-MIROC5_rcp85_r1i1p1_CLMcom-CCLM4-8-17_v1_mon_200701-200712.nc')
[3]:
time_compare = pyh.time_compare(file1, file2)
[4]:
time_compare.max_intersection()
[4]:
(cftime.DatetimeNoLeap(2007, 1, 16, 12, 0, 0, 0, has_year_zero=True),
 cftime.DatetimeNoLeap(2007, 7, 16, 12, 0, 0, 0, has_year_zero=True))
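
To make the idea of a "maximum intersection" concrete, here is a minimal sketch in plain Python. It is not pyhomogenize's implementation: the helper `common_time_range` is hypothetical, it uses standard-library `datetime` objects instead of cftime, and it assumes that the intersection consists of the time stamps present in every axis.

```python
from datetime import datetime


def common_time_range(*axes):
    """Return (first, last) time stamp shared by all axes, or None if none is shared.

    Hypothetical helper for illustration; not part of pyhomogenize.
    """
    common = set(axes[0])
    for axis in axes[1:]:
        common &= set(axis)  # keep only time stamps present in every axis
    if not common:
        return None
    return min(common), max(common)


# Hypothetical monthly axes loosely mimicking the two test files
axis1 = [datetime(2007, m, 16, 12) for m in range(1, 8)]   # Jan-Jul 2007
axis2 = [datetime(2007, m, 16, 12) for m in range(1, 13)]  # Jan-Dec 2007

result = common_time_range(axis1, axis2)
print(result)
```

Under these assumptions the shared range runs from the first to the last time step of the shorter axis, which matches the shape of the result above.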

We can also use time_control objects or xarray Datasets instead of files on disk.

[5]:
import xarray as xr
[6]:
time_control = pyh.time_control(pyh.test_netcdf[0])
xr_dataset = xr.open_dataset(pyh.test_netcdf[1])
file3 = pyh.test_netcdf[2]
[7]:
time_control.time
[7]:
CFTimeIndex([2007-01-16 12:00:00, 2007-02-15 00:00:00, 2007-03-16 12:00:00,
             2007-04-16 00:00:00, 2007-05-16 12:00:00, 2007-06-16 00:00:00,
             2007-07-16 12:00:00],
            dtype='object', length=7, calendar='noleap', freq='None')
[8]:
xr_dataset.time
[8]:
<xarray.DataArray 'time' (time: 12)>
array([cftime.DatetimeNoLeap(2007, 1, 16, 12, 0, 0, 0, has_year_zero=True),
       cftime.DatetimeNoLeap(2007, 2, 15, 0, 0, 0, 0, has_year_zero=True),
       cftime.DatetimeNoLeap(2007, 3, 16, 12, 0, 0, 0, has_year_zero=True),
       cftime.DatetimeNoLeap(2007, 4, 16, 0, 0, 0, 0, has_year_zero=True),
       cftime.DatetimeNoLeap(2007, 5, 16, 12, 0, 0, 0, has_year_zero=True),
       cftime.DatetimeNoLeap(2007, 6, 16, 0, 0, 0, 0, has_year_zero=True),
       cftime.DatetimeNoLeap(2007, 7, 16, 12, 0, 0, 0, has_year_zero=True),
       cftime.DatetimeNoLeap(2007, 8, 16, 12, 0, 0, 0, has_year_zero=True),
       cftime.DatetimeNoLeap(2007, 9, 16, 0, 0, 0, 0, has_year_zero=True),
       cftime.DatetimeNoLeap(2007, 10, 16, 12, 0, 0, 0, has_year_zero=True),
       cftime.DatetimeNoLeap(2007, 11, 16, 0, 0, 0, 0, has_year_zero=True),
       cftime.DatetimeNoLeap(2007, 12, 16, 12, 0, 0, 0, has_year_zero=True)],
      dtype=object)
Coordinates:
  * time     (time) object 2007-01-16 12:00:00 ... 2007-12-16 12:00:00
    height   float64 2.0
Attributes:
    standard_name:  time
    long_name:      time
    bounds:         time_bnds
    axis:           T
[9]:
file3.split("/")[-1]
[9]:
'tas_EUR-11_MIROC-MIROC5_rcp85_r1i1p1_CLMcom-CCLM4-8-17_v1_mon_200705-200712.nc'
[10]:
time_compare_2 = pyh.time_compare(time_control, xr_dataset, file3)
time_compare_2.max_intersection()
[10]:
(cftime.DatetimeNoLeap(2007, 6, 16, 0, 0, 0, 0, has_year_zero=True),
 cftime.DatetimeNoLeap(2007, 7, 16, 12, 0, 0, 0, has_year_zero=True))

Now we want to select the maximum time intersection from each input.

[11]:
from copy import copy

time_control_2 = copy(time_control)
xr_dataset_2 = xr_dataset.copy()
selected = time_compare_2.select_max_intersection()
selected
[11]:
[<pyhomogenize._time_control.time_control at 0x7f6342a43af0>,
 <pyhomogenize._time_control.time_control at 0x7f63419238b0>,
 <pyhomogenize._time_control.time_control at 0x7f63419239a0>]

As we can see, we get a list of three time_control objects. Let’s have a look at their time axes.

[12]:
selected[0].time
[12]:
CFTimeIndex([2007-06-16 00:00:00, 2007-07-16 12:00:00],
            dtype='object', length=2, calendar='noleap', freq=None)
[13]:
selected[1].time
[13]:
CFTimeIndex([2007-06-16 00:00:00, 2007-07-16 12:00:00],
            dtype='object', length=2, calendar='noleap', freq=None)
[14]:
selected[2].time
[14]:
CFTimeIndex([2007-06-16 00:00:00, 2007-07-16 12:00:00],
            dtype='object', length=2, calendar='noleap', freq=None)
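
All three selected axes are identical. The selection step can be pictured with the following sketch, again in plain Python and not taken from pyhomogenize. The helper `select_range` and the three axes are hypothetical (they are not the contents of the test files), and the sketch assumes sorted axes whose time stamps line up exactly.

```python
from datetime import datetime


def select_range(axis, start, end):
    """Keep only the time steps of `axis` that lie within [start, end].

    Hypothetical helper for illustration; not part of pyhomogenize.
    """
    return [t for t in axis if start <= t <= end]


# Hypothetical monthly axes (not the contents of the test files)
axes = [
    [datetime(2007, m, 1) for m in range(1, 8)],   # Jan-Jul
    [datetime(2007, m, 1) for m in range(1, 13)],  # Jan-Dec
    [datetime(2007, m, 1) for m in range(6, 13)],  # Jun-Dec
]

# Shared range: latest start and earliest end over all (sorted) axes
start = max(axis[0] for axis in axes)
end = min(axis[-1] for axis in axes)

selected_axes = [select_range(axis, start, end) for axis in axes]

# After selection, every axis should cover exactly the same time steps
all_equal = all(axis == selected_axes[0] for axis in selected_axes)
print(all_equal)
```

A check like `all_equal` is a quick way to confirm that the inputs are aligned before comparing or combining their data.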