{ "cells": [ { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "# Providing a bespoke function to the temporal reduce methods" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import matplotlib.pyplot as plt\n", "import numpy as np\n", "from earthkit.data.testing import earthkit_remote_test_data_file\n", "from statsmodels import api as sm\n", "\n", "from earthkit import data as ekd\n", "from earthkit.transforms import aggregate as ekt\n", "\n", "ekd.settings.set(\"cache-policy\", \"user\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Load some test data\n", "\n", "All `earthkit-transforms` methods can be called with `earthkit-data` objects (Readers and Wrappers) or with the \n", "pre-loaded `xarray`.\n", "\n", "In this example we will use hourly ERA5 2m temperature data on a 0.5x0.5 spatial grid for the year 2015 as\n", "our physical data.\n", "\n", "First we download (if not already cached) lazily load the ERA5 data (please see tutorials in `earthkit-data` for more details in cache management).\n", "\n", "We convert the data to an `xarray.Dataset` with some additional options which are better suited for the data we are working with." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Get some demonstration ERA5 data, this could be any url or path to an ERA5 grib or netCDF file.\n", "remote_era5_file = earthkit_remote_test_data_file(\"era5_temperature_europe_2015.grib\")\n", "era5_data = ekd.from_source(\"url\", remote_era5_file)\n", "era5_xr = era5_data.to_xarray(time_dim_mode=\"valid_time\").rename({\"2t\": \"t2m\"})\n", "era5_xr" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Reduce the ERA5 data using a bespoke function over the time dimension\n", "\n", "The default reduction method is `mean`, other methods can be applied using the `how` kwarg.\n", "\n", "Note that we do not need to worry about the data format of the input array, earthkit will convert it to the required xarray format internally." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Define a bespoke function for the reduction\n", "\n", "When providing a our own function to the reduce method it must conform the the Xarray requirements defined in their [documentation pages](https://docs.xarray.dev/en/stable/generated/xarray.DataArray.reduce.html#xarray-dataarray-reduce). Specifically:\n", "\n", "*\"Function which can be called in the form f(x, axis=axis, \\*\\*kwargs) to return the result of reducing an np.ndarray over an integer valued axis.\"*\n", "\n", "We will define a function which calculates the trend based on the [statsmodels Ordinary Least Squares model]https://www.statsmodels.org/dev/generated/statsmodels.regression.linear_model.OLS.html.\n", "\n", "Note that you will need to install statsmodels for the following function, it is available via PyPi and conda." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def trend(x, axis=0, **kwargs):\n", " # Add a constant for the intercept\n", " X = sm.add_constant(np.arange(x.shape[axis]))\n", "\n", " _x = np.rollaxis(x, axis)\n", " _x_shape = _x.shape\n", "\n", " _x_flat = _x.reshape(_x_shape[0], -1)\n", " _out = np.zeros_like(_x_flat[0])\n", " for point in range(_x_flat.shape[1]):\n", " model = sm.OLS(_x_flat[:, point], X).fit()\n", " _out[point] = model.params[1]\n", "\n", " return _out.reshape(_x_shape[1:])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can now use this method in our temporal reduce function to calculate the trend at each grid point." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "era5_t_trend = ekt.temporal.reduce(era5_xr, how=trend)\n", "era5_t_trend" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# A simple matplotlib plot to view the data:\n", "era5_t_trend.t2m.plot(cmap=\"coolwarm\", vmin=-0.01, vmax=0.01)\n", "plt.title(\"ERA5 2m Temperature Trend 20150101\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": ".conda", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.12" } }, "nbformat": 4, "nbformat_minor": 2 }