Spatial domain | Global |
Spatial resolution | 0.25 degrees (~20km) |
Time domain | 2000-01-01 00:00:00 UTC to Present |
Time resolution | 3.0 hours |
The Global Ensemble Forecast System (GEFS) is a National Oceanic and Atmospheric Administration (NOAA) National Centers for Environmental Prediction (NCEP) weather forecast model.
This analysis dataset is an archive of the model's best estimate of past weather. It is created by concatenating the first few hours of each historical forecast to provide a dataset with dimensions time, latitude, and longitude.
This dataset is designed to be used in conjunction with the GEFS forecast 35 day dataset.
Storage for this dataset is generously provided by Source Cooperative, a Radiant Earth initiative.
dimension | min | max | units |
---|---|---|---|
latitude | -90 | 90 | degrees_north |
longitude | -180 | 179.75 | degrees_east |
time | 2000-01-01T00:00:00 | Present | seconds since 1970-01-01 |
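Because longitudes run from -180 to 179.75 rather than 0 to 360, coordinates taken from other sources may need converting before a lookup. The following is a minimal sketch of that conversion plus snapping to the 0.25-degree grid; the function names are hypothetical and not part of any library.

```python
def to_dataset_longitude(lon_degrees_east: float) -> float:
    """Map any longitude (for example one in 0..360) into [-180, 180)."""
    return (lon_degrees_east + 180.0) % 360.0 - 180.0


def snap_to_grid(value: float, step: float = 0.25) -> float:
    """Round a coordinate to the nearest multiple of the grid step."""
    return round(value / step) * step


def nearest_grid_longitude(lon_degrees_east: float) -> float:
    """Snap first, then wrap, so that 179.9 maps to -180.0 and not 180.0."""
    return to_dataset_longitude(snap_to_grid(lon_degrees_east))
```

When selecting with xarray, `sel(..., method="nearest")` already handles the snapping, so usually only the longitude conversion is needed.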
variable | units | dimensions |
---|---|---|
categorical_freezing_rain_surface | 0=no; 1=yes | time × latitude × longitude |
categorical_ice_pellets_surface | 0=no; 1=yes | time × latitude × longitude |
categorical_rain_surface | 0=no; 1=yes | time × latitude × longitude |
categorical_snow_surface | 0=no; 1=yes | time × latitude × longitude |
downward_long_wave_radiation_flux_surface | W/(m^2) | time × latitude × longitude |
downward_short_wave_radiation_flux_surface | W/(m^2) | time × latitude × longitude |
geopotential_height_cloud_ceiling | gpm | time × latitude × longitude |
maximum_temperature_2m | C | time × latitude × longitude |
minimum_temperature_2m | C | time × latitude × longitude |
percent_frozen_precipitation_surface | % | time × latitude × longitude |
precipitable_water_atmosphere | kg/(m^2) | time × latitude × longitude |
precipitation_surface | mm/s | time × latitude × longitude |
pressure_reduced_to_mean_sea_level | Pa | time × latitude × longitude |
pressure_surface | Pa | time × latitude × longitude |
relative_humidity_2m | % | time × latitude × longitude |
temperature_2m | C | time × latitude × longitude |
total_cloud_cover_atmosphere | % | time × latitude × longitude |
wind_u_100m | m/s | time × latitude × longitude |
wind_u_10m | m/s | time × latitude × longitude |
wind_v_100m | m/s | time × latitude × longitude |
wind_v_10m | m/s | time × latitude × longitude |
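Wind is stored as separate u (eastward) and v (northward) components, so speed and direction have to be derived. A minimal sketch using the standard meteorological convention follows; the function names are hypothetical, and the same arithmetic applies to the 10m or 100m component pairs.

```python
import math


def wind_speed(u: float, v: float) -> float:
    """Wind speed in m/s from eastward (u) and northward (v) components."""
    return math.hypot(u, v)


def wind_direction_from(u: float, v: float) -> float:
    """Direction the wind blows FROM, in degrees clockwise from north.

    0 = from the north, 90 = from the east. For calm wind (u = v = 0)
    the direction is physically undefined; this function returns 0.0.
    """
    return math.degrees(math.atan2(-u, -v)) % 360.0
```

For example, `u = 1, v = 0` is air moving eastward, which is a westerly wind with a direction of about 270 degrees.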
import xarray as xr # xarray>=2025.1.2 and zarr>=3.0.4 for zarr v3 support
ds = xr.open_zarr("https://data.dynamical.org/noaa/gefs/analysis/[email protected]")
ds['temperature_2m'].sel(time="2025-01-01T00", latitude=0, longitude=0).compute()
To provide the longest possible historical record, this dataset is constructed from three distinct GEFS forecast archives.
Data is available for all variables at all times with the following exceptions.
relative_humidity_2m, percent_frozen_precipitation_surface, categorical_freezing_rain_surface, categorical_ice_pellets_surface, categorical_rain_surface, categorical_snow_surface
geopotential_height_cloud_ceiling
To create a single time dimension we concatenate the first few hours of each forecast.
From 2000-01-01 to 2019-12-31, reforecasts are available once per day and this dataset uses the first 21 or 24 hours of each forecast. From 2020-01-01 to present, forecasts are available every 6 hours and this dataset uses the first 3 or 6 hours of each forecast.
Variables with an instantaneous step_type use the shortest possible lead times (e.g. 0 and 3 hours), while accumulated variables must use one additional forecast step (e.g. 3 and 6 hours) because they do not have an hour 0 forecast value.
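The lead-time rule above can be sketched as a small function that maps a timestamp in this dataset back to the forecast step that would supply it. This is an illustration, not the reformatting code itself: the function name is hypothetical, and it assumes reforecasts initialize at 00 UTC, operational forecasts at 00, 06, 12, and 18 UTC, and that the switch between the two is decided by initialization time.

```python
from datetime import datetime, timedelta
from typing import Tuple

# First initialization time with 6-hourly forecasts (from the text above).
SIX_HOURLY_START = datetime(2020, 1, 1)


def _lead_hours(valid_time: datetime, init_interval: int, accumulated: bool) -> int:
    """Shortest usable lead time for a given initialization interval."""
    lead = valid_time.hour % init_interval
    if accumulated and lead == 0:
        # Accumulated variables have no hour 0 value, so reach back
        # to the previous initialization instead.
        lead = init_interval
    return lead


def source_forecast(valid_time: datetime, accumulated: bool) -> Tuple[datetime, int]:
    """Return (init_time, lead_hours) assumed to supply valid_time (UTC)."""
    if valid_time.hour % 3 != 0:
        raise ValueError("dataset times fall on 3-hourly steps")
    lead = _lead_hours(valid_time, 6, accumulated)
    init_time = valid_time - timedelta(hours=lead)
    if init_time < SIX_HOURLY_START:
        # Before 2020 only one reforecast per day exists (assumed 00 UTC).
        lead = _lead_hours(valid_time, 24, accumulated)
        init_time = valid_time - timedelta(hours=lead)
    return init_time, lead
```

Under these assumptions, an instantaneous value at 2025-01-01T03 comes from the 00 UTC forecast at lead 3, while an accumulated value at 2025-01-01T06 comes from the same forecast at lead 6.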
For most of the archive's time range the source data is available at 0.25-degree resolution and a 3-hourly time step, and we perform no interpolation. There are two exceptions: 1) From 2020-01-01 to 2020-09-23 the source data has a 1.0-degree spatial resolution and a 6-hourly time step. 2) From 2020-09-23 to present the 100m wind components have a 0.5-degree spatial resolution in the source data. To provide a consistent archive, in these two cases we first perform bilinear interpolation in space to 0.25-degree resolution, followed by linear interpolation in time to a 3-hourly time step if necessary. The original, uninterpolated data can be obtained by selecting whole-degree latitudes and longitudes (values evenly divisible by 1) and, in case 1), time steps whose hour is divisible by 6.
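The selection rule for uninterpolated data can be expressed as a predicate on a single sample. The sketch below is illustrative only: the function name is hypothetical, and treating 2020-09-23 as the exclusive end of the first case is an assumption, since the text does not say which case that boundary date falls under.

```python
from datetime import datetime

LOW_RES_START = datetime(2020, 1, 1)
LOW_RES_END = datetime(2020, 9, 23)  # treated as exclusive here (assumption)
WIND_100M = ("wind_u_100m", "wind_v_100m")


def is_uninterpolated(time: datetime, latitude: float, longitude: float,
                      variable: str) -> bool:
    """True if the sample matches the selection rule for original source data."""
    on_whole_degree = latitude % 1 == 0 and longitude % 1 == 0
    if LOW_RES_START <= time < LOW_RES_END:
        # Case 1: 1.0-degree, 6-hourly source data for every variable.
        return on_whole_degree and time.hour % 6 == 0
    if time >= LOW_RES_END and variable in WIND_100M:
        # Case 2: follows the whole-degree rule from the text, which is
        # conservative given the 0.5-degree source grid.
        return on_whole_degree
    # Everywhere else the source is already 0.25-degree and 3-hourly.
    return True
```

The same conditions can be turned into boolean masks over the latitude, longitude, and time coordinates when subsetting the full dataset.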
The data values in this dataset have been rounded in their binary floating point representation to improve compression. See Klöwer et al. 2021 for more information on this approach. The exact number of rounded bits can be found in our reformatting code.
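To illustrate what rounding in the binary floating point representation means, here is a minimal sketch of mantissa bit rounding for a single 32-bit float. It is not the dataset's reformatting code: the function name is hypothetical, the values are assumed to be float32, and `keep_bits` is a free parameter because the actual per-variable bit counts are not given here.

```python
import math
import struct


def bitround_float32(value: float, keep_bits: int) -> float:
    """Round a float32 to keep_bits mantissa bits (nearest, ties to even)."""
    if not 0 < keep_bits < 23:
        raise ValueError("keep_bits must be between 1 and 22")
    if not math.isfinite(value):
        return value  # leave NaN and infinities untouched
    # Reinterpret the float32 bit pattern as an unsigned 32-bit integer.
    (bits,) = struct.unpack("<I", struct.pack("<f", value))
    drop = 23 - keep_bits                 # trailing mantissa bits to zero out
    half_minus_one = (1 << (drop - 1)) - 1
    # Adding the lowest kept bit makes exact ties round to even.
    bits += half_minus_one + ((bits >> drop) & 1)
    bits &= (0xFFFFFFFF >> drop) << drop  # clear the dropped bits
    return struct.unpack("<f", struct.pack("<I", bits))[0]
```

With 7 mantissa bits kept, the relative error stays within 2^-8 (about 0.4%), and the long runs of trailing zero bits are what make the values compress well.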