Dryad

Can ingredients-based forecasting be learned? Disentangling a random forest's severe weather predictions

Abstract

Machine learning (ML)-based models have been rapidly integrated into forecast practices across the weather forecasting community in recent years. While ML tools introduce additional data to forecasting operations, explainability needs to be available alongside the model output so that the guidance is transparent and trustworthy for the forecaster. This work uses the tree interpreter (TI) algorithm to disaggregate the contributions of meteorological features used in the Colorado State University Machine Learning Probabilities (CSU-MLP) system, a random forest-based ML tool that produces real-time probabilistic forecasts for severe weather using inputs from the Global Ensemble Forecast System v12. TI feature contributions are analyzed in time and space for CSU-MLP day-2 and day-3 individual hazard (tornado, wind, and hail) forecasts and day-4 aggregate severe forecasts over a 2-yr period. For individual forecast periods, this work demonstrates that feature contributions derived from TI can be interpreted in an ingredients-based sense, effectively making the CSU-MLP probabilities physically interpretable. When investigated in an aggregate sense, TI illustrates that the CSU-MLP system's predictions use meteorological inputs in ways that are consistent with the spatiotemporal patterns seen in meteorological fields that pertain to severe storms climatology. This work concludes with a discussion of how these insights could benefit model development, real-time forecast operations, and retrospective event analysis.
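
To make the disaggregation described in the abstract concrete, the following is a minimal pure-Python sketch of the general idea behind a tree-interpreter decomposition: a tree's prediction is written as a bias term (the value at the root node) plus one contribution per feature, where each split along the decision path credits the change in node value to the feature that was split on. A forest averages these terms over its trees. This is not the CSU-MLP code. The two toy trees, the feature names `cape` and `shear`, the thresholds, and the node values are all hypothetical and chosen only for illustration.

```python
"""Toy sketch of a tree-interpreter style decomposition (pure Python).

All trees, feature names, thresholds, and values here are hypothetical
illustrations, not CSU-MLP data or code.
"""


def leaf(value):
    """Terminal node; `value` stands for the mean training outcome at the leaf."""
    return {"value": value}


def split(value, feature, threshold, left, right):
    """Internal node; `value` is the mean outcome of samples reaching it."""
    return {"value": value, "feature": feature, "threshold": threshold,
            "left": left, "right": right}


def interpret_tree(tree, x):
    """Return (prediction, bias, contributions) for one sample `x` (a dict)."""
    bias = tree["value"]          # root value: the tree's baseline
    contributions = {}
    current = tree
    while "feature" in current:   # walk the decision path down to a leaf
        name = current["feature"]
        child = (current["left"] if x[name] <= current["threshold"]
                 else current["right"])
        # Credit the change in node value to the feature that was split on.
        contributions[name] = (contributions.get(name, 0.0)
                               + child["value"] - current["value"])
        current = child
    return current["value"], bias, contributions


def interpret_forest(forest, x):
    """Average the per-tree decomposition over all trees in the forest."""
    n = len(forest)
    prediction = 0.0
    bias = 0.0
    contributions = {}
    for tree in forest:
        p, b, c = interpret_tree(tree, x)
        prediction += p / n
        bias += b / n
        for name, value in c.items():
            contributions[name] = contributions.get(name, 0.0) + value / n
    return prediction, bias, contributions


# Two hypothetical trees that output a severe-weather probability.
tree_a = split(0.10, "cape", 1000.0,
               leaf(0.02),
               split(0.25, "shear", 15.0, leaf(0.10), leaf(0.40)))
tree_b = split(0.12, "shear", 20.0,
               leaf(0.04),
               split(0.30, "cape", 1500.0, leaf(0.20), leaf(0.50)))

sample = {"cape": 2000.0, "shear": 25.0}   # hypothetical grid-point inputs
prediction, bias, contributions = interpret_forest([tree_a, tree_b], sample)

print(f"prediction = {prediction:.3f}, bias = {bias:.3f}")
for name, value in sorted(contributions.items()):
    print(f"  contribution[{name}] = {value:+.3f}")
```

By construction, the bias plus the sum of the contributions equals the forest prediction, which is what allows each forecast probability to be attributed to individual input fields. In the sketch the contributions are keyed by feature name; a real system would also need to track the forecast time and location of each input.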