{ "cells": [ { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "# Providing a bespoke function to the temporal reduce methods" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from statsmodels import api as sm\n", "import numpy as np\n", "import matplotlib.pyplot as plt\n", "\n", "from earthkit.transforms import aggregate as ekt\n", "from earthkit import data as ekd\n", "\n", "from earthkit.data.testing import earthkit_remote_test_data_file\n", "ekd.settings.set(\"cache-policy\", \"user\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Load some test data\n", "\n", "All `earthkit-transforms` methods can be called with `earthkit-data` objects (Readers and Wrappers) or with the \n", "pre-loaded `xarray`.\n", "\n", "In this example we will use hourly ERA5 2m temperature data on a 0.5x0.5 spatial grid for the year 2015 as\n", "our physical data.\n", "\n", "First we download (if not already cached) lazily load the ERA5 data (please see tutorials in `earthkit-data` for more details in cache management).\n", "\n", "We convert the data to an `xarray.Dataset` with some additional options which are better suited for the data we are working with." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Get some demonstration ERA5 data, this could be any url or path to an ERA5 grib or netCDF file.\n", "remote_era5_file = earthkit_remote_test_data_file(\"era5_temperature_europe_2015.grib\")\n", "era5_data = ekd.from_source(\"url\", remote_era5_file)\n", "era5_xr = era5_data.to_xarray(time_dim_mode=\"valid_time\").rename({\"2t\": \"t2m\"})\n", "era5_xr" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Reduce the ERA5 data using a bespoke function over the time dimension\n", "\n", "The default reduction method is `mean`, other methods can be applied using the `how` kwarg.\n", "\n", "Note that we do not need to worry about the data format of the input array, earthkit will convert it to the required xarray format internally." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Define a bespoke function for the reduction\n", "\n", "When providing a our own function to the reduce method it must conform the the Xarray requirements defined in their [documentation pages](https://docs.xarray.dev/en/stable/generated/xarray.DataArray.reduce.html#xarray-dataarray-reduce). Specifically:\n", "\n", "*\"Function which can be called in the form f(x, axis=axis, \\*\\*kwargs) to return the result of reducing an np.ndarray over an integer valued axis.\"*\n", "\n", "We will define a function which calculates the trend based on the [statsmodels Ordinary Least Squares model]https://www.statsmodels.org/dev/generated/statsmodels.regression.linear_model.OLS.html.\n", "\n", "Note that you will need to install statsmodels for the following function, it is available via PyPi and conda." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def trend(x, axis=0, **kwargs):\n", " # Add a constant for the intercept\n", " X = sm.add_constant(np.arange(x.shape[axis]))\n", "\n", " _x = np.rollaxis(x, axis)\n", " _x_shape = _x.shape\n", "\n", " _x_flat = _x.reshape(_x_shape[0], -1)\n", " _out = np.zeros_like(_x_flat[0])\n", " for point in range(_x_flat.shape[1]):\n", " model = sm.OLS(_x_flat[:, point], X).fit()\n", " _out[point] = model.params[1]\n", "\n", " return _out.reshape(_x_shape[1:])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can now use this method in our temporal reduce function to calculate the trend at each grid point." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "era5_t_trend = ekt.temporal.reduce(era5_xr, how=trend)\n", "era5_t_trend" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# A simple matplotlib plot to view the data:\n", "era5_t_trend.t2m.plot(cmap=\"coolwarm\", vmin=-0.01, vmax=0.01)\n", "plt.title(\"ERA5 2m Temperature Trend 20150101\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": ".conda", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.13.7" } }, "nbformat": 4, "nbformat_minor": 2 }