Weather forecast evaluation

There are seemingly infinite ways to evaluate a forecast: by lead time, by variable, by region, against observations or against a reanalysis, with metrics that reward sharpness or metrics that reward calibration. Each choice encodes an opinion about what "good" means.

We explore the tradeoffs among these choices, with the aim of building a framework for deciding which forecasts to trust in a given situation. We are particularly interested in how probabilistic forecasts can best inform human decision-making.
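
As a small illustration of how the choice of metric encodes an opinion, the sketch below scores two hypothetical probabilistic rain forecasts against the same observed outcomes. The data, the forecaster names, and the helper functions (`brier_score`, `sharpness`, `skill_score`) are all invented for this example and are not taken from the projects listed here. Sharpness is measured as the variance of the issued probabilities, which is one common convention among several.

```python
from statistics import pvariance

def brier_score(probs, outcomes):
    """Mean squared error of forecast probabilities for a binary event (lower is better)."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(outcomes)

def sharpness(probs):
    """Variance of the issued probabilities; ignores the outcomes entirely."""
    return pvariance(probs)

def skill_score(score, reference_score):
    """Relative improvement over a reference forecast (1 is perfect, 0 matches the reference)."""
    return 1.0 - score / reference_score

# Toy data, invented for illustration: 1.0 means it rained, 0.0 means it stayed dry.
outcomes = [1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0]

# Hypothetical forecaster A always issues the base rate (3 rainy days out of 10).
climatology = [0.3] * 10

# Hypothetical forecaster B always commits to 0 or 1, and is wrong on 3 of 10 days.
confident = [1.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0]

for name, probs in [("climatology", climatology), ("confident", confident)]:
    print(f"{name:12s} brier={brier_score(probs, outcomes):.3f} "
          f"sharpness={sharpness(probs):.3f}")

# Skill of the confident forecast measured against climatology as the reference.
bss = skill_score(brier_score(confident, outcomes), brier_score(climatology, outcomes))
print(f"Brier skill score of 'confident' vs 'climatology': {bss:.3f}")
```

On this toy data the confident forecaster is much sharper (variance 0.24 against 0) but has the worse Brier score (0.30 against 0.21), so its skill relative to climatology is negative. Which of the two counts as "better" depends on which property the evaluation is set up to reward.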

Projects

Global Airport Observations (ASOS/AWOS)
Global METAR surface observations, 1940–present, as cloud-native GeoParquet — ground truth for scoring forecasts against what actually happened.

Scorecard
Operational skill scores for forecast models, scored against observations.

