{ "cells": [ { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "# General temporal aggregation methods" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# If running for the first time, uncomment the line below to install any additional dependencies\n", "# !bash requirements-for-notebooks.sh" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from earthkit.data.testing import earthkit_remote_test_data_file\n", "\n", "from earthkit import data as ekd\n", "from earthkit.transforms import aggregate as ekt\n", "\n", "ekd.settings.set(\"cache-policy\", \"user\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Load some test data\n", "\n", "All `earthkit-transforms` methods can be called with `earthkit-data` objects (Readers and Wrappers) or with \n", "pre-loaded `xarray` objects.\n", "\n", "In this example we will use hourly ERA5 2m temperature data on a 0.5x0.5 spatial grid for the year 2015 as\n", "our physical data.\n", "\n", "First we download (if not already cached) and lazily load the ERA5 data (please see the tutorials in `earthkit-data` for more details on cache management).\n", "\n", "We then convert the data to an xarray dataset, using options that suit the data we are working with. The `earthkit-transforms` methods can handle `earthkit-data` objects out of the box, but for clarity we create the xarray objects here." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Get some demonstration ERA5 data; this could be any URL or path to an ERA5 GRIB or netCDF file.\n", "remote_era5_file = earthkit_remote_test_data_file(\"era5_temperature_europe_2015.grib\")\n", "era5_data = ekd.from_source(\"url\", remote_era5_file)\n", "era5_xr = era5_data.to_xarray(time_dim_mode=\"valid_time\").rename({\"2t\": \"t2m\"})" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Reduce the ERA5 data over the time dimension\n", "\n", "The default reduction method is `mean`; other methods can be applied using the `how` kwarg.\n", "\n", "Note that we do not need to worry about the data format of the input array; earthkit will convert it to the required xarray format internally.\n", "\n", "The returned object is an xarray dataset; however, this may change in a future version of the package.\n", "\n", "### The mean over the time dimension" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "era5_t_mean = ekt.temporal.reduce(era5_xr) # how=\"mean\"\n", "era5_t_mean" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# A simple matplotlib plot to view the data:\n", "era5_t_mean.t2m.plot()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### The median over the time dimension" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "era5_t_median = ekt.temporal.reduce(era5_xr, how=\"median\")\n", "era5_t_median" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# A simple matplotlib plot to view the data:\n", "era5_t_median.t2m.plot()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Calling the temporal reduce method with an arbitrary function\n", "\n", "The `temporal.reduce` method can take any function which is accepted by the xarray reduce method, 
typically this means it must take `axis` as an argument. See the [xarray.Dataset.reduce](https://docs.xarray.dev/en/stable/generated/xarray.Dataset.reduce.html) documentation for more details." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "\n", "\n", "def my_method(array, axis=None, **kwargs):\n", "    # Product of the mean and the standard deviation along the reduced axis\n", "    return np.mean(array, axis=axis, **kwargs) * np.std(array, axis=axis, **kwargs)\n", "\n", "\n", "era5_t_my_method = ekt.temporal.reduce(era5_xr, how=my_method, how_label=\"random\")\n", "era5_t_my_method" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# A simple matplotlib plot to view the data:\n", "era5_t_my_method.t2m_random.plot()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Calculate a rolling mean with a 50 timestep window\n", "\n", "There is no temporal-specific method for a rolling reduction. The general `rolling_reduce` method can perform this calculation if you specify the dimension over which to reduce, together with the window length in timesteps (here `valid_time=50`)." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "era5_rolling = ekt.rolling_reduce(\n", "    era5_xr,\n", "    valid_time=50,\n", "    center=True,\n", ")\n", "era5_rolling" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": ".conda", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.13.7" } }, "nbformat": 4, "nbformat_minor": 2 }