Climatologies and anomalies

What is a climatology?

A climatology is a long-term average of a variable, such as temperature or precipitation, over a specific period of time. Climatologies are typically calculated over a 30-year period, but the period can vary depending on the context and the data available. They are often used to characterise the typical conditions of a region and to identify anomalies, which are deviations from the long-term average. Climatologies are often calculated at a given frequency, such as daily or monthly, to capture the typical conditions for each day of the year or each month.
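To make the definition concrete, the following is a minimal sketch in plain Python, independent of earthkit. It computes a monthly climatology over a reference period and then the anomalies relative to it. The sample values and the helper names `monthly_climatology` and `anomalies` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical monthly-mean temperatures (degrees C) as (year, month, value).
records = [
    (2001, 1, 2.0), (2001, 7, 18.0),
    (2002, 1, 4.0), (2002, 7, 20.0),
    (2003, 1, 3.0), (2003, 7, 22.0),
]


def monthly_climatology(records, start_year, end_year):
    """Average the values of each calendar month over the reference period."""
    by_month = defaultdict(list)
    for year, month, value in records:
        if start_year <= year <= end_year:
            by_month[month].append(value)
    return {month: mean(values) for month, values in by_month.items()}


def anomalies(records, climatology):
    """Subtract the matching monthly climatology from every value."""
    return [
        (year, month, value - climatology[month])
        for year, month, value in records
    ]


clim = monthly_climatology(records, 2001, 2003)
anoms = anomalies(records, clim)
```

Here `clim` maps each calendar month to its long-term mean, and each entry in `anoms` is the deviation of a single month from that mean.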

Calculating climatologies and anomalies with earthkit.transforms

The earthkit.transforms.climatology module includes methods for calculating climatologies and anomalies, including grouping the data in the time dimension to produce climatological daily or monthly values. The API follows a similar pattern to the temporal aggregation methods: there is a generic climatology.reduce method, which allows all parameters to be passed, and a series of convenience methods, which wrap climatology.reduce and set the how parameter and/or the frequency parameter.

API documentation for reduce
earthkit.transforms.climatology.reduce(dataarray: Dataset | DataArray, time_dim: str | None = None, how: str | Callable | None = 'mean', groupby_kwargs: dict | None = None, climatology_range: tuple | list | None = None, **reduce_kwargs)

Group data annually over a given frequency and reduce using the specified how method.

Parameters:
  • dataarray (xarray.Dataset or xarray.DataArray) – The data object over which to calculate the climatology. Must contain a time dimension.

  • how (str or callable) – Method used to reduce the data. Default is 'mean', which uses the xarray built-in mean. If a string, it must be a built-in xarray reduce method, an earthkit how method or any method compatible with the array namespace of the data. Where names are duplicated, the method is selected in the order: xarray, earthkit, array_namespace. Otherwise, it can be any function which can be called in the form f(x, axis=axis, **kwargs) to return the result of reducing an array over an integer-valued axis.

  • frequency (str (optional)) – Frequency used for grouping the data in climatology mode. Typical values include dayofyear, weekofyear, month, year, etc. The full set of accepted options matches those supported by earthkit.transforms._tools.groupby_time. If not provided, the climatology is calculated over the entire period.

  • bin_widths (int or list (optional)) – If bin_widths is an int, it defines the width of each group bin on the frequency provided by frequency. If bin_widths is a sequence it defines the edges of each bin, allowing for non-uniform bin widths.

  • time_dim (str (optional)) – Name of the time dimension in the data object. The default behaviour is to detect the time dimension from the input object.

  • climatology_range (list or tuple (optional)) – Start and end year of the period to be used for the reference climatology. Default is to use the entire time-series.

  • groupby_kwargs (dict) – Any other kwargs that are accepted by earthkit.transforms.aggregate.groupby_time

  • **reduce_kwargs – Any other kwargs that are accepted by earthkit.transforms.aggregate.reduce (except how)

Return type:

xarray.DataArray
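As an illustration of the callable form accepted by how, the sketch below defines a function with the signature f(x, axis=axis, **kwargs) and applies it directly to a NumPy array. The function name `value_range` and the sample data are hypothetical; the sketch only demonstrates the calling convention and does not show how earthkit invokes the function internally.

```python
import numpy as np


def value_range(x, axis=None, **kwargs):
    """Reduce an array along `axis` to its maximum-minus-minimum range."""
    return np.max(x, axis=axis, **kwargs) - np.min(x, axis=axis, **kwargs)


# Hypothetical data: 3 time steps (axis 0) at 2 grid points (axis 1).
data = np.array([[1.0, 10.0], [4.0, 12.0], [2.0, 18.0]])

# Reducing over axis 0 collapses the time steps and keeps the grid points.
reduced = value_range(data, axis=0)
```

A function written this way could then be passed as `how=value_range`, assuming the data is backed by arrays that NumPy functions can operate on.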

The convenience methods are all documented in the API reference guide for the earthkit.transforms.climatology package. Users are also advised to refer to the notebook examples.
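For orientation, a call to the generic reduce method might look like the sketch below, which requests a monthly mean climatology over a reference period. It relies only on the parameters listed above. The wrapper name `monthly_mean_climatology` and the 1991 to 2020 default period are hypothetical, and passing the years as integers is an assumption; the API reference and notebook examples remain the authoritative source.

```python
def monthly_mean_climatology(dataarray, start_year=1991, end_year=2020):
    """Monthly mean climatology over a reference period (illustrative helper)."""
    # Imported inside the function so the helper can be defined even where
    # earthkit-transforms is not installed; calling it requires the package.
    from earthkit.transforms import climatology

    return climatology.reduce(
        dataarray,
        how="mean",          # reduction method, as documented for `how`
        frequency="month",   # group the data by calendar month
        # Reference period as (start year, end year); integer years are an
        # assumption of this sketch.
        climatology_range=(start_year, end_year),
    )
```

The input is expected to be an xarray object with a time dimension, for example `monthly_mean_climatology(my_dataarray)`, where `my_dataarray` is a placeholder for the user's own data.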