Use pyhomogenize to check the time axis of netCDF file(s): time_control
Now we want to use pyhomogenize’s time_control class. We open a test netCDF file; this is done automatically when the class is called.
[1]:
import pyhomogenize as pyh
[2]:
time_control = pyh.time_control(pyh.test_netcdf[0])
time_control.ds
[2]:
<xarray.Dataset>
Dimensions:       (time: 7, bnds: 2, rlat: 412, rlon: 424, vertices: 4)
Coordinates:
  * time          (time) object 2007-01-16 12:00:00 ... 2007-07-16 12:00:00
    lon           (rlat, rlon) float64 dask.array<chunksize=(412, 424), meta=np.ndarray>
    lat           (rlat, rlon) float64 dask.array<chunksize=(412, 424), meta=np.ndarray>
  * rlon          (rlon) float64 -28.38 -28.26 -28.16 ... 17.93 18.05 18.16
  * rlat          (rlat) float64 -23.38 -23.26 -23.16 ... 21.61 21.73 21.83
    height        float64 ...
Dimensions without coordinates: bnds, vertices
Data variables:
    time_bnds     (time, bnds) object dask.array<chunksize=(1, 2), meta=np.ndarray>
    lon_bnds      (rlat, rlon, vertices) float64 dask.array<chunksize=(412, 424, 4), meta=np.ndarray>
    lat_bnds      (rlat, rlon, vertices) float64 dask.array<chunksize=(412, 424, 4), meta=np.ndarray>
    rotated_pole  int32 ...
    tas           (time, rlat, rlon) float32 dask.array<chunksize=(1, 412, 424), meta=np.ndarray>
Attributes: (12/26)
    CDI:            Climate Data Interface version ?? (http:/...
    history:        Fri Mar 25 10:44:26 2022: cdo seldate,200...
    source:         CLMcom-CCLM4-8-17
    institution:    CLMcom, Climate Limited-area Modelling Co...
    Conventions:    CF-1.4
    contact:        klima.projektionen@dwd.de
    ...             ...
    project_id:     CORDEX
    product:        output
    frequency:      mon
    tracking_id:    490ab140-e096-11e7-b22c-81c28a935756
    creation_date:  2017-12-14T07:16:13Z
    CDO:            Climate Data Operators version 1.9.3 (htt...
Let’s have a look at the dataset’s time axis.
[3]:
time_control.time
[3]:
CFTimeIndex([2007-01-16 12:00:00, 2007-02-15 00:00:00, 2007-03-16 12:00:00,
2007-04-16 00:00:00, 2007-05-16 12:00:00, 2007-06-16 00:00:00,
2007-07-16 12:00:00],
dtype='object', length=7, calendar='noleap', freq='None')
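The timestamps above are consistent with the midpoint of each calendar month of 2007, so the spacing between neighbouring steps is not constant. The following sketch reproduces the values with the standard library only. `month_midpoints` is a hypothetical helper and not part of pyhomogenize, and the Gregorian `datetime` stands in for the `noleap` calendar, which is only safe here because 2007 has no leap day.

```python
from datetime import datetime


def month_midpoints(year, months):
    """Return the midpoint between the first day of each month and of the next one."""
    mids = []
    for month in months:
        start = datetime(year, month, 1)
        # First day of the following month (roll over to January after December).
        end = datetime(year + (month == 12), month % 12 + 1, 1)
        mids.append(start + (end - start) / 2)
    return mids


# Hypothetical stand-in for the monthly axis of the test file (January to July 2007).
midpoints = month_midpoints(2007, range(1, 8))
for stamp in midpoints:
    print(stamp)

# Collect the distinct spacings between neighbouring timestamps.
steps = {later - earlier for earlier, later in zip(midpoints, midpoints[1:])}
print(sorted(steps))
```

In this sketch `steps` holds two distinct values, 29.5 and 30.5 days, so no single fixed step describes the whole axis.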
We can check whether the time axis contains duplicated, missing or redundant time steps. A redundant time step is a time step that does not match the dataset’s calendar and/or frequency.
[4]:
duplicates = time_control.get_duplicates()
redundants = time_control.get_redundants()
missings = time_control.get_missings()
[5]:
duplicates, redundants, missings
[5]:
('', '', '')
We see that the time axis doesn’t contain any incorrect time steps and that no time steps are missing. Not really an auspicious example. We can combine the three requests above by using the function check_timestamps.
[6]:
timechecker1 = time_control.check_timestamps()
timechecker1
[6]:
<pyhomogenize._time_control.time_control at 0x7f7dca066790>
As we can see, the function returns a time_control object again, but with three new attributes.
[7]:
timechecker1.duplicated_timesteps, timechecker1.missing_timesteps, timechecker1.redundant_timesteps
[7]:
({'tas': ''}, {'tas': ''}, {'tas': ''})
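Because the test file has a clean time axis, the three categories are easier to see on a deliberately broken one. The sketch below is a minimal pure-Python illustration on a hypothetical daily axis. `classify_timesteps` and its arguments are assumptions made for this example and do not reflect how pyhomogenize implements the checks.

```python
from datetime import datetime, timedelta


def classify_timesteps(axis, start, end, step):
    """Compare `axis` with the regular axis start, start + step, ..., end."""
    expected = []
    current = start
    while current <= end:
        expected.append(current)
        current += step

    seen = set()
    duplicates = []
    for stamp in axis:
        # A timestamp that shows up a second time is a duplicate.
        if stamp in seen and stamp not in duplicates:
            duplicates.append(stamp)
        seen.add(stamp)

    # Redundant: present on the axis but not on the expected regular grid.
    redundants = sorted(seen - set(expected))
    # Missing: expected on the regular grid but absent from the axis.
    missings = [stamp for stamp in expected if stamp not in seen]
    return duplicates, redundants, missings


# Hypothetical daily axis with one problem of each kind.
axis = [
    datetime(2007, 1, 1),
    datetime(2007, 1, 2),
    datetime(2007, 1, 2),      # duplicated
    datetime(2007, 1, 2, 12),  # off the daily grid, hence redundant
    datetime(2007, 1, 4),      # 2007-01-03 is missing
]
duplicates, redundants, missings = classify_timesteps(
    axis, datetime(2007, 1, 1), datetime(2007, 1, 4), timedelta(days=1)
)
print(duplicates, redundants, missings)
```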
Suppose we want to test the time axis only for duplicated time steps:
timechecker2 = time_control.check_timestamps(selection='duplicates')
timechecker2.duplicated_timesteps
By setting the parameter correct to the boolean value True, we can delete the duplicated and redundant time steps, if any exist. Of course, in our great example this is not the case.
[8]:
timechecker3 = time_control.check_timestamps(correct=True)
timechecker3.time
[8]:
CFTimeIndex([2007-01-16 12:00:00, 2007-02-15 00:00:00, 2007-03-16 12:00:00,
2007-04-16 00:00:00, 2007-05-16 12:00:00, 2007-06-16 00:00:00,
2007-07-16 12:00:00],
dtype='object', length=7, calendar='noleap', freq='None')
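Since nothing is removed from the clean test axis, the effect of such a correction can be sketched on a hypothetical daily axis instead. The helper `drop_duplicates_and_redundants` is an assumption for illustration only. It keeps the first occurrence of every timestamp that lies on a regular grid and, in line with the description above, does not fill in missing time steps.

```python
from datetime import datetime, timedelta


def drop_duplicates_and_redundants(axis, start, step):
    """Keep the first occurrence of each timestamp that lies on the regular grid."""
    kept = []
    seen = set()
    for stamp in axis:
        # On the grid means: an integer number of steps away from `start`.
        on_grid = (stamp - start) % step == timedelta(0)
        if on_grid and stamp not in seen:
            kept.append(stamp)
            seen.add(stamp)
    return kept


# Hypothetical daily axis with a duplicate and an off-grid timestamp.
broken_axis = [
    datetime(2007, 1, 1),
    datetime(2007, 1, 2),
    datetime(2007, 1, 2),      # duplicate, dropped
    datetime(2007, 1, 2, 12),  # off the daily grid, dropped
    datetime(2007, 1, 4),      # kept; the gap on 2007-01-03 stays a gap
]
corrected = drop_duplicates_and_redundants(
    broken_axis, datetime(2007, 1, 1), timedelta(days=1)
)
print(corrected)
```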
We can set the parameter output to select the dataset’s output file name on disk. If so, the parameter correct is automatically set to True.
[9]:
timechecker4 = time_control.check_timestamps(output="output.nc")
Now we want to select a specific time range. We copy our time_control object to keep the original object unchanged.
[10]:
from copy import copy
time_control1 = copy(time_control)
selected1 = time_control1.select_time_range(["2007-02-01", "2007-03-30"])
selected1
[10]:
<pyhomogenize._time_control.time_control at 0x7f7dc97fab80>
Here again, we get a time_control object, but now with a different time axis.
[11]:
selected1.time
[11]:
CFTimeIndex([2007-02-15 00:00:00, 2007-03-16 12:00:00],
dtype='object', length=2, calendar='noleap', freq=None)
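The selection above keeps exactly the timestamps that fall between the two given dates. A minimal standard-library sketch of that idea follows. `filter_time_range` is a hypothetical helper, the inclusive treatment of both bounds is an assumption, and the Gregorian `datetime` stands in for the `noleap` calendar.

```python
from datetime import datetime

# The seven timestamps of the test file, copied from the output shown earlier.
monthly_axis = [
    datetime(2007, 1, 16, 12),
    datetime(2007, 2, 15, 0),
    datetime(2007, 3, 16, 12),
    datetime(2007, 4, 16, 0),
    datetime(2007, 5, 16, 12),
    datetime(2007, 6, 16, 0),
    datetime(2007, 7, 16, 12),
]


def filter_time_range(axis, time_range, fmt="%Y-%m-%d"):
    """Return the timestamps lying between the two date strings (bounds included)."""
    start, end = (datetime.strptime(value, fmt) for value in time_range)
    return [stamp for stamp in axis if start <= stamp <= end]


selected = filter_time_range(monthly_axis, ["2007-02-01", "2007-03-30"])
print(selected)
```

For this input the sketch keeps the February and March timestamps, the same two steps as in the output above.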
Of course, we can write the result to disk as a netCDF file.
[12]:
time_control2 = copy(time_control)
selected2 = time_control2.select_time_range(
["2007-02-01", "2007-03-30"], output="output.nc"
)
selected2.ds
[12]:
<xarray.Dataset>
Dimensions:       (time: 2, bnds: 2, rlat: 412, rlon: 424, vertices: 4)
Coordinates:
  * time          (time) object 2007-02-15 00:00:00 2007-03-16 12:00:00
    lon           (rlat, rlon) float64 dask.array<chunksize=(412, 424), meta=np.ndarray>
    lat           (rlat, rlon) float64 dask.array<chunksize=(412, 424), meta=np.ndarray>
  * rlon          (rlon) float64 -28.38 -28.26 -28.16 ... 17.93 18.05 18.16
  * rlat          (rlat) float64 -23.38 -23.26 -23.16 ... 21.61 21.73 21.83
    height        float64 ...
Dimensions without coordinates: bnds, vertices
Data variables:
    time_bnds     (time, bnds) object dask.array<chunksize=(1, 2), meta=np.ndarray>
    lon_bnds      (rlat, rlon, vertices) float64 dask.array<chunksize=(412, 424, 4), meta=np.ndarray>
    lat_bnds      (rlat, rlon, vertices) float64 dask.array<chunksize=(412, 424, 4), meta=np.ndarray>
    rotated_pole  int32 ...
    tas           (time, rlat, rlon) float32 dask.array<chunksize=(1, 412, 424), meta=np.ndarray>
Attributes: (12/26)
    CDI:            Climate Data Interface version ?? (http:/...
    history:        Fri Mar 25 10:44:26 2022: cdo seldate,200...
    source:         CLMcom-CCLM4-8-17
    institution:    CLMcom, Climate Limited-area Modelling Co...
    Conventions:    CF-1.4
    contact:        klima.projektionen@dwd.de
    ...             ...
    project_id:     CORDEX
    product:        output
    frequency:      mon
    tracking_id:    490ab140-e096-11e7-b22c-81c28a935756
    creation_date:  2017-12-14T07:16:13Z
    CDO:            Climate Data Operators version 1.9.3 (htt...
If we want to crop or limit the time axis to user-specified start and end month values, as shown in the above example basics.date_range_to_frequency_limits, we can do this with netCDF files as well. Here, the time axis should start at the beginning of an arbitrary season and end at the end of an arbitrary season.
[13]:
time_control3 = copy(time_control)
selected3 = time_control3.select_limited_time_range(
smonth=[3, 6, 9, 12], emonth=[2, 5, 8, 11], output="output.nc"
)
selected3.time
[13]:
CFTimeIndex([2007-03-16 12:00:00, 2007-04-16 00:00:00, 2007-05-16 12:00:00],
dtype='object', length=3, calendar='noleap', freq='732H')
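The result above is consistent with a simple rule: start at the first timestamp whose month is one of the start months and stop at the last timestamp whose month is one of the end months. The sketch below implements that rule with the standard library. `crop_to_months` is a hypothetical helper, and the rule itself is inferred from the output shown here rather than taken from pyhomogenize’s source.

```python
from datetime import datetime

# The seven timestamps of the test file, copied from the output shown earlier.
monthly_axis = [
    datetime(2007, 1, 16, 12),
    datetime(2007, 2, 15, 0),
    datetime(2007, 3, 16, 12),
    datetime(2007, 4, 16, 0),
    datetime(2007, 5, 16, 12),
    datetime(2007, 6, 16, 0),
    datetime(2007, 7, 16, 12),
]


def crop_to_months(axis, smonth, emonth):
    """Crop `axis` so it starts in a month from `smonth` and ends in one from `emonth`."""
    starts = [i for i, stamp in enumerate(axis) if stamp.month in smonth]
    ends = [i for i, stamp in enumerate(axis) if stamp.month in emonth]
    # No valid window if either list is empty or the last end precedes the first start.
    if not starts or not ends or starts[0] > ends[-1]:
        return []
    return axis[starts[0]:ends[-1] + 1]


# Seasons starting in March, June, September, December and ending one month earlier.
cropped = crop_to_months(monthly_axis, smonth=[3, 6, 9, 12], emonth=[2, 5, 8, 11])
print(cropped)
```

With the January to July axis, only the March to May season is complete, so the sketch returns those three timestamps.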
Now, we want to check whether the time axis is within certain left and right bounds.
[14]:
time_control.within_time_range(["2007-02-01", "2007-03-30"])
[14]:
True
[15]:
time_control.within_time_range(["2007-02-01", "2008-03-30"])
[15]:
False
[16]:
time_control.within_time_range(["20070201", "20070330"], fmt="%Y%m%d")
[16]:
True
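The three results above are consistent with a check that the requested range lies inside the span of the time axis, with `fmt` describing how the date strings are written. A minimal sketch of that reading follows. `range_within_axis` is a hypothetical helper, its comparison rule is an assumption inferred from the outputs, and `fmt` is passed to the standard library’s `datetime.strptime`.

```python
from datetime import datetime

# First and last timestamp of the test file's time axis, copied from the output above.
axis_start = datetime(2007, 1, 16, 12)
axis_end = datetime(2007, 7, 16, 12)


def range_within_axis(time_range, fmt="%Y-%m-%d"):
    """Return True if both date strings fall inside [axis_start, axis_end]."""
    start, end = (datetime.strptime(value, fmt) for value in time_range)
    return axis_start <= start and end <= axis_end


inside = range_within_axis(["2007-02-01", "2007-03-30"])
outside = range_within_axis(["2007-02-01", "2008-03-30"])
compact = range_within_axis(["20070201", "20070330"], fmt="%Y%m%d")
print(inside, outside, compact)
```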