{ "cells": [ { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "# Daily statistics from six-hourly SEAS5 data" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from earthkit.transforms import aggregate as ekt\n", "from earthkit import data as ekd\n", "\n", "from earthkit.data.testing import earthkit_remote_test_data_file\n", "ekd.settings.set(\"cache-policy\", \"user\")\n", "\n", "import matplotlib.pyplot as plt" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Load some test data\n", "\n", "All `earthkit-transforms` methods can be called with `earthkit-data` objects (Readers and Wrappers) or with the \n", "pre-loaded `xarray`.\n", "\n", "In this example we will use three initialisation of the SEAS5 2m temperature data on a 1.x1. spatial grid. The temporal resolution is 6 hourly, and we have the forecasts for January, February and March 2015.\n", "\n", "First we download (if not already cached) and lazily load the SEAS5 data (please see tutorials in `earthkit-data` for more details in cache management).\n", "\n", "We convert the data to an `xarray.Dataset` object with some additional options better suited for the data we're handling." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Get some demonstration ERA5 data, this could be any url or path to an ERA5 grib or netCDF file.\n", "remote_seas5_file = earthkit_remote_test_data_file(\"seas5_2m_temperature_201501-201503_europe_1deg.grib\")\n", "seas5_data = ekd.from_source(\"url\", remote_seas5_file)\n", "seas5_xr = seas5_data.to_xarray(time_dim_mode=\"forecast\", add_valid_time_coord=True).rename({\"2t\": \"t2m\"})\n", "seas5_xr" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Calculate the daily median of the Seasonal Forecast data\n", "\n", "In this first example we will handle the forecast initialisations independently, i.e. return the daily median of the 3 different forecasts. To do this we must specify that the time-dimension we wish to calculate the aggregation over is the \"step\" dimension." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "seas_daily_median_by_step = ekt.temporal.daily_median(\n", " seas5_xr, time_dim=\"step\"\n", ")\n", "seas_daily_median_by_step.coords[\"valid_time\"] = (\n", " seas_daily_median_by_step[\"forecast_reference_time\"] + seas_daily_median_by_step[\"step\"]\n", ")\n", "seas_daily_median_by_step" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "seas5_daily_median_by_vt = ekt.temporal.daily_median(seas5_xr, time_dim=\"valid_time\")\n", "seas5_daily_median_by_vt" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Plot a random point location to see the different aggregation methods" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "isel_kwargs = {\"latitude\":20, \"longitude\":20}\n", "\n", "fig, axes = plt.subplots(ncols=2, nrows=1, figsize=(15,5))\n", "\n", "forecast_colours = [\"blue\", \"red\", \"green\"]\n", "\n", "# era5_data.to_xarray().t2m.isel(**isel_kwargs).plot(label='Raw data', ax=ax)\n", "f_kwargs = {\"label\": \"Daily median over step\"}\n", "\n", "for itime in range(3):\n", " for number in range(25):\n", " t_data = seas_daily_median_by_step.t2m.isel(**isel_kwargs, number=number, forecast_reference_time=itime)\n", " if number == 0:\n", " extra_kwargs = {\"label\": f\"FC ref time: {str(t_data.forecast_reference_time.values)[:10]}\"}\n", " else:\n", " extra_kwargs = {}\n", " t_data.plot(\n", " x = \"valid_time\",\n", " ax=axes[0], color=forecast_colours[itime], **extra_kwargs\n", " )\n", "axes[0].legend(loc=2)\n", "axes[0].set_title(\"Daily median by step\")\n", "\n", "for number in range(25):\n", " t_data = seas5_daily_median_by_vt.t2m.isel(**isel_kwargs, number=number)\n", " extra_kwargs = {}\n", " t_data.plot(\n", " x = \"date\",\n", " ax=axes[1], color=forecast_colours[0], **extra_kwargs\n", " )\n", "\n", "axes[1].set_title(\"Daily median by valid_time\")\n", "\n", "plt.show()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": ".conda", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.13.7" } }, "nbformat": 4, "nbformat_minor": 2 }