Using Bayesian Model Averaging to Calibrate Forecast Ensembles

1 May 2005

journal article
Published by American Meteorological Society in Monthly Weather Review

Vol. 133 (5), 1155-1174
https://doi.org/10.1175/mwr2906.1

Abstract

Ensembles used for probabilistic weather forecasting often exhibit a spread-error correlation, but they tend to be underdispersive. This paper proposes a statistical method for postprocessing ensembles based on Bayesian model averaging (BMA), which is a standard method for combining predictive distributions from different sources. The BMA predictive probability density function (PDF) of any quantity of interest is a weighted average of PDFs centered on the individual bias-corrected forecasts, where the weights are equal to posterior probabilities of the models generating the forecasts and reflect the models' relative contributions to predictive skill over the training period. The BMA weights can be used to assess the usefulness of ensemble members, and this can be used as a basis for selecting ensemble members; this can be useful given the cost of running large ensembles. The BMA PDF can be represented as an unweighted ensemble of any desired size, by simulating from the BMA predictive distribution.

The BMA predictive variance can be decomposed into two components, one corresponding to the between-forecast variability, and the second to the within-forecast variability. Predictive PDFs or intervals based solely on the ensemble spread incorporate the first component but not the second. Thus BMA provides a theoretical explanation of the tendency of ensembles to exhibit a spread-error correlation but yet be underdispersive.

The method was applied to 48-h forecasts of surface temperature in the Pacific Northwest in January–June 2000 using the University of Washington fifth-generation Pennsylvania State University–NCAR Mesoscale Model (MM5) ensemble. The predictive PDFs were much better calibrated than the raw ensemble, and the BMA forecasts were sharp in that 90% BMA prediction intervals were 66% shorter on average than those produced by sample climatology. As a by-product, BMA yields a deterministic point forecast, and this had root-mean-square errors 7% lower than the best of the ensemble members and 8% lower than the ensemble mean. Similar results were obtained for forecasts of sea level pressure. Simulation experiments show that BMA performs reasonably well when the underlying ensemble is calibrated, or even overdispersed.
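
The three mechanics described in the abstract (a weighted mixture of PDFs centered on the forecasts, the split of the predictive variance into between-forecast and within-forecast parts, and simulation of an unweighted ensemble) can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes normal component PDFs with a single common standard deviation `sigma`, forecasts that are already bias-corrected, and weights that are already known; the function names and the numbers in the example are hypothetical, and estimating the weights from training data is left out.

```python
import numpy as np


def bma_pdf(y, forecasts, weights, sigma):
    """Mixture density at y: weighted average of normal PDFs centered on each forecast.

    Assumption: normal kernels with one shared standard deviation `sigma`.
    """
    y = np.atleast_1d(np.asarray(y, dtype=float))[:, None]   # shape (n_points, 1)
    f = np.asarray(forecasts, dtype=float)[None, :]          # shape (1, n_members)
    w = np.asarray(weights, dtype=float)
    kernels = np.exp(-0.5 * ((y - f) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))
    return kernels @ w                                       # shape (n_points,)


def bma_mean_variance(forecasts, weights, sigma):
    """Return (mean, between-forecast variance, within-forecast variance)."""
    f = np.asarray(forecasts, dtype=float)
    w = np.asarray(weights, dtype=float)
    mean = np.sum(w * f)                      # deterministic point forecast
    between = np.sum(w * (f - mean) ** 2)     # weighted spread of the members
    within = sigma ** 2                       # variance of each component PDF
    return mean, between, within


def bma_sample(forecasts, weights, sigma, size, rng):
    """Draw an unweighted ensemble of `size` values from the mixture."""
    f = np.asarray(forecasts, dtype=float)
    w = np.asarray(weights, dtype=float)
    members = rng.choice(len(f), size=size, p=w)   # pick a member by its weight
    return rng.normal(loc=f[members], scale=sigma) # then perturb around its forecast


if __name__ == "__main__":
    # Hypothetical bias-corrected forecasts (deg C) and weights summing to 1.
    forecasts = [3.0, 4.0, 6.0]
    weights = [0.5, 0.3, 0.2]
    sigma = 1.5

    mean, between, within = bma_mean_variance(forecasts, weights, sigma)
    print("point forecast:", mean)
    print("between-forecast variance:", between)
    print("within-forecast variance:", within)
    print("total predictive variance:", between + within)

    rng = np.random.default_rng(0)
    ensemble = bma_sample(forecasts, weights, sigma, size=1000, rng=rng)
    print("simulated ensemble variance:", ensemble.var())
```

In this sketch, an interval built only from the spread of the member forecasts would reflect the `between` term alone, whereas the mixture variance also includes `within`, which is the sense in which a raw ensemble can track forecast error and still be underdispersive.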
