Longrange Weather Forecasting with Machine Learning: Khulna Example

A. Grossman¹, S. Bowman¹, F. Fabrizi², C. Mattioli², S. Ravela¹ and T. Reynolds²

1: Earth Signals and Systems Group, Earth, Atmospheric and Planetary Sciences, MIT
2: Lincoln Laboratory, MIT

This work is part of a joint initiative for quantifying risk from climate change and providing decision support.

NOT FOR PUBLIC RELEASE
INTERNAL NOTES FOR DISCUSSION

Forecasting Methodology:

A forecast is launched every day, six months ahead. The input data are either the measurements from the last six months or values that were themselves forecast. Thus, if measurements are available all the way up to the present time, a forecast is launched from six months prior for tomorrow and beyond, typically out to 48 months.
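The input selection described above can be sketched as follows. This is only an illustration under stated assumptions, not the project's implementation: the 182-day window length, the function name build_input_window, and the dictionary-based data layout are all hypothetical, and the example values are synthetic.

```python
from datetime import date, timedelta

WINDOW_DAYS = 182  # assumed length of the "last six months" input window


def build_input_window(launch_day, measured, forecast, window_days=WINDOW_DAYS):
    """Collect the daily inputs for the window ending the day before launch_day.

    measured and forecast map date -> value. A measurement is preferred;
    an earlier forecast fills any day that has no measurement.
    """
    window = []
    for offset in range(window_days, 0, -1):
        day = launch_day - timedelta(days=offset)
        if day in measured:
            window.append((day, measured[day], "measured"))
        elif day in forecast:
            window.append((day, forecast[day], "forecast"))
        else:
            raise KeyError(f"no measured or forecast value for {day}")
    return window


# Synthetic example: measurements stop 10 days before the launch day,
# so the most recent 10 days are filled from earlier forecasts.
launch = date(2020, 7, 1)
measured = {launch - timedelta(days=k): 90.0 for k in range(11, 183)}
forecast = {launch - timedelta(days=k): 88.0 for k in range(1, 11)}
window = build_input_window(launch, measured, forecast)
n_forecast_days = sum(1 for _, _, source in window if source == "forecast")
```

Tagging each value with its source keeps it visible how much of a given launch rests on measurements and how much on earlier forecasts.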

Methods:

The methods involve a site-specific, data-driven deep learning and statistical learning framework, and a site-network graphical model. More details will follow after a complete peer review.

Timeseries Analysis and Longrange Maximum Daily Temperature Forecast (Khulna):

Longrange Forecasting Example

Longer Range Error Growth:

Previous (four years)

Persistence

Contrast this with the persistence model, in which we persist the monthly mean into the future. This is a degenerate timeseries model. A key number is the test error, which is much larger, as is the variance, though some months are decidedly better.
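A minimal sketch of such a persistence baseline is given below. It assumes that "persisting the monthly mean" means forecasting a calendar month with the average of that month's means over the last x training years; that reading, the function names, and the numbers are illustrative assumptions rather than the configuration used for the figures.

```python
from collections import defaultdict
from math import sqrt


def monthly_means(records):
    """records: iterable of (year, month, value). Returns {(year, month): mean}."""
    sums = defaultdict(lambda: [0.0, 0])
    for year, month, value in records:
        acc = sums[(year, month)]
        acc[0] += value
        acc[1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


def persistence_forecast(train_means, month, last_x_years):
    """Average the given calendar month over the last x training years."""
    years = sorted(y for (y, m) in train_means if m == month)[-last_x_years:]
    return sum(train_means[(y, month)] for y in years) / len(years)


def rmse(pairs):
    """Root-mean-square error over (forecast, observed) pairs."""
    pairs = list(pairs)
    return sqrt(sum((f - o) ** 2 for f, o in pairs) / len(pairs))


# Synthetic January maxima (values are made up for illustration).
train = [(2015, 1, 70.0), (2015, 1, 72.0), (2016, 1, 73.0),
         (2017, 1, 74.0), (2018, 1, 76.0)]
means = monthly_means(train)
four_year = persistence_forecast(means, month=1, last_x_years=4)
one_year = persistence_forecast(means, month=1, last_x_years=1)
test_error = rmse([(four_year, 75.5), (one_year, 75.0)])
```

Because the forecast ignores everything except past monthly means, its test error is a natural floor that any learned model should beat.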

The decorrelation temperature error is ~6°F, as seen from the error growth graph. Thus, “last x years” persistence is essentially at the decorrelated error level in test.

The Evidence for Phase Synchronization

We see substantial evidence for phase synchronization between indicators and temperature (some fields filled with NaNs are not correctly processed).

Therefore this can be predicted: we are able to predict the phase to within 0.1° (which corresponds to within half a week) under zero-lag conditions, using purely the indicators available at the time of interest.
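One common way to quantify phase synchronization between two series is the phase-locking value, the magnitude of the mean unit vector of their phase differences. The sketch below is not necessarily the measure used in this work: it estimates the phase of the annual harmonic year by year with a single-frequency Fourier projection, and the 365-sample period, function names, and synthetic series are all assumptions.

```python
import cmath
import math

PERIOD = 365  # assumed samples per annual cycle (daily data, leap days ignored)


def harmonic_phase(values, period=PERIOD):
    """Phase (radians) of the Fourier coefficient at the given period."""
    coeff = sum(v * cmath.exp(-2j * math.pi * t / period)
                for t, v in enumerate(values))
    return cmath.phase(coeff)


def yearly_phases(series, period=PERIOD):
    """Annual-harmonic phase for each complete year in the series."""
    n_years = len(series) // period
    return [harmonic_phase(series[k * period:(k + 1) * period], period)
            for k in range(n_years)]


def phase_locking_value(phases_a, phases_b):
    """1 means a constant phase difference; values near 0 mean no locking."""
    units = [cmath.exp(1j * (a - b)) for a, b in zip(phases_a, phases_b)]
    return abs(sum(units) / len(units))


# Synthetic stand-ins: temperature trails the indicator by 30 days.
days = range(4 * PERIOD)
indicator = [math.cos(2 * math.pi * t / PERIOD) for t in days]
temperature = [80 + 10 * math.cos(2 * math.pi * (t - 30) / PERIOD) for t in days]

ind_phases = yearly_phases(indicator)
temp_phases = yearly_phases(temperature)
plv = phase_locking_value(ind_phases, temp_phases)
first_year_diff = ind_phases[0] - temp_phases[0]  # radians
```

For these synthetic series the phase difference is the same every year, so the phase-locking value is essentially 1; real indicators would give a value between 0 and 1.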

Naturally, efficacy under lagged forecasts still needs to be evaluated. We also point out that using lag-free indicators to predict daily mean temperatures is only good to within 3°C, and that only for “out-of-bag” prediction error assessment. The equivalent number for the lagged temperature forecast case shown in an earlier figure is 0.4°C, with 1.2°C test error. Global indicators are still better than persistence, but the key test would be to study performance for lagged predictions.
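For readers unfamiliar with out-of-bag assessment, the sketch below shows the idea with a deliberately simple stand-in learner. Each model in a bagged ensemble is fit on a bootstrap resample, and every sample is scored only by the models that never saw it. The straight-line learner, the function names, and the data are assumptions for illustration; the learner actually used here is not specified in these notes.

```python
import random
from math import sqrt


def fit_line(xs, ys):
    """Ordinary least-squares line; returns (slope, intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    return slope, my - slope * mx


def oob_rmse(xs, ys, n_models=200, seed=0):
    """Out-of-bag RMSE of a bagged ensemble of least-squares lines."""
    rng = random.Random(seed)
    n = len(xs)
    sums = [0.0] * n
    counts = [0] * n
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap resample
        in_bag = set(idx)
        slope, intercept = fit_line([xs[i] for i in idx], [ys[i] for i in idx])
        for i in range(n):
            if i not in in_bag:  # only models that never saw sample i vote
                sums[i] += slope * xs[i] + intercept
                counts[i] += 1
    errors = [(sums[i] / counts[i] - ys[i]) ** 2 for i in range(n) if counts[i]]
    return sqrt(sum(errors) / len(errors))


# Noise-free synthetic data: the out-of-bag error should be essentially zero.
xs = [float(i) for i in range(20)]
ys = [2.0 * x + 1.0 for x in xs]
error = oob_rmse(xs, ys)
```

Out-of-bag error is computed on the same period as the training data, which is why it is expected to be more optimistic than a test error on a held-out later period.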

Precipitation

The models developed for precipitation are NOT time series models. A separate model is developed for each month and each forecast lead time. These results suggest that total monthly rainfall could be predicted to within about 10%.
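The "one model per month and lead time" structure can be sketched as a table of independent models keyed by (month, lead). Everything below is a hypothetical illustration: the single predictor, the straight-line model, the function names, and the rainfall numbers are assumptions, not the models or data behind the results above.

```python
from collections import defaultdict


def fit_line(xs, ys):
    """Ordinary least-squares line; returns (slope, intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    return slope, my - slope * mx


def fit_per_month_lead(samples):
    """samples: (target_month, lead_months, predictor, monthly_rain_total).

    Returns one independently fitted model per (month, lead) pair.
    """
    groups = defaultdict(list)
    for month, lead, x, y in samples:
        groups[(month, lead)].append((x, y))
    return {key: fit_line([x for x, _ in pts], [y for _, y in pts])
            for key, pts in groups.items()}


def predict(models, month, lead, x):
    slope, intercept = models[(month, lead)]
    return slope * x + intercept


def relative_error(predicted, observed):
    """Fractional error in the monthly total (0.1 means 10%)."""
    return abs(predicted - observed) / observed


# Synthetic July totals (mm) at two lead times; values are made up.
samples = [
    (7, 1, 1.0, 300.0), (7, 1, 2.0, 340.0), (7, 1, 3.0, 380.0),
    (7, 6, 1.0, 310.0), (7, 6, 2.0, 330.0), (7, 6, 3.0, 350.0),
]
models = fit_per_month_lead(samples)
july_lead1 = predict(models, month=7, lead=1, x=2.5)
july_error = relative_error(july_lead1, observed=400.0)
```

No model shares parameters with another, so a July forecast at a six-month lead is unaffected by how the one-month-lead model was fit.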