We assess the value of machine learning as an accelerator for the parameterisation schemes of operational weather forecasting systems, specifically the parameterisation of non-orographic gravity wave drag. Emulators of this scheme can be trained to produce stable and accurate results up to seasonal forecasting timescales. Generally, more complex networks produce more accurate emulators. By training on an increased-complexity version of the existing parameterisation scheme, we build emulators that produce more accurate forecasts. For medium-range forecasting, we find evidence that our emulators are more accurate than the version of the parameterisation scheme that is used for operational predictions. Using the current operational CPU hardware, our emulators have a similar computational cost to the existing scheme but are heavily limited by data movement. On GPU hardware, our emulators run ten times faster than the existing scheme does on a CPU.
The numerous recent breakthroughs in machine learning (ML) make it imperative to consider carefully how the scientific community can benefit from a technology that, although not necessarily new, is today living its golden age. This Grand Challenge review paper focuses on the present and future role of machine learning in space weather. The purpose is twofold. On the one hand, we discuss previous works that use ML for space weather forecasting, focusing in particular on the few areas that have seen the most activity: the forecasting of geomagnetic indices, of relativistic electrons at geosynchronous orbits, of solar flare occurrence, of coronal mass ejection propagation time, and of solar wind speed. On the other hand, this paper serves as a gentle introduction to the field of machine learning tailored to the space weather community and as a pointer to a number of open challenges that we believe the community should undertake in the next decade. The recurring themes throughout the review are the need to shift our forecasting paradigm to a probabilistic approach focused on the reliable assessment of uncertainties, and the combination of physics-based and machine learning approaches, known as gray-box.
The formation of precipitation is an important process in state-of-the-art weather and climate models. Understanding its relationship with other variables can bring substantial benefits, particularly for the world's monsoon regions, which depend on rainfall to support livelihoods. Various factors play a crucial role in the formation of rainfall, and these physical processes lead to significant biases in operational weather forecasts. We use the UNET architecture of a deep convolutional neural network with residual learning as a proof of concept to learn global data-driven models of precipitation. The models are trained on reanalysis datasets projected onto the cubed sphere to minimize errors due to spherical distortion. The results are compared with the operational dynamical model used by the India Meteorological Department. The theoretical deep learning-based model doubles both the grid-point and the area-averaged skill, measured by Pearson correlation coefficients, relative to the operational system. This study is a proof of concept showing that a residual learning-based UNET can unravel physical relationships to target precipitation, and that those physical constraints can be used in dynamical operational models to improve precipitation forecasts. Our results pave the way for the development of online, hybrid models in the future.
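The two skill measures named in this abstract, grid-point and area-averaged Pearson correlation, can be sketched as follows. This is a minimal illustration and not the authors' evaluation code: it assumes fields on a regular latitude-longitude grid shaped (time, lat, lon) rather than the cubed sphere used in the study, it assumes cosine-of-latitude area weights, and the function names `pearson`, `grid_point_skill`, and `area_averaged_skill` are hypothetical.

```python
import numpy as np

def pearson(a, b, axis=0):
    """Pearson correlation of a and b along the given axis."""
    a_anom = a - a.mean(axis=axis, keepdims=True)
    b_anom = b - b.mean(axis=axis, keepdims=True)
    cov = (a_anom * b_anom).sum(axis=axis)
    denom = np.sqrt((a_anom ** 2).sum(axis=axis) * (b_anom ** 2).sum(axis=axis))
    return cov / denom

def grid_point_skill(forecast, observed):
    """Correlation over time at every grid point; inputs are (time, lat, lon)."""
    return pearson(forecast, observed, axis=0)

def area_averaged_skill(forecast, observed, lat_deg):
    """Correlation over time of the area-mean series (assumed cos(lat) weights)."""
    w = np.cos(np.deg2rad(lat_deg))[:, None] * np.ones(forecast.shape[2])
    f_mean = (forecast * w).sum(axis=(1, 2)) / w.sum()
    o_mean = (observed * w).sum(axis=(1, 2)) / w.sum()
    return float(pearson(f_mean, o_mean, axis=0))

# Synthetic stand-in data (not real precipitation fields).
rng = np.random.default_rng(0)
lat = np.linspace(-60.0, 60.0, 5)
observed = rng.normal(size=(30, 5, 8))
forecast = 2.0 * observed + 1.0  # perfectly linearly related to the observations
point_map = grid_point_skill(forecast, observed)
area_score = area_averaged_skill(forecast, observed, lat)
```

Under these assumptions, comparing the scores of two forecasting systems against the same observations gives the kind of relative skill statement made in the abstract.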
Modern weather and climate models share a common heritage, and often even components, but they are used in different ways to answer fundamentally different questions. Attempts to emulate them using machine learning should reflect this. While the use of machine learning to emulate weather forecast models is a relatively new endeavour, there is a rich history of climate model emulation. This is primarily because weather modelling is an initial condition problem that depends intimately on the current state of the atmosphere, whereas climate modelling is predominantly a boundary condition problem. To emulate the response of the climate to different drivers, therefore, representing the full dynamical evolution of the atmosphere is neither necessary nor, in many cases, desirable. Climate scientists are also typically interested in different questions. Indeed, emulating the steady-state climate response has been possible for many years and provides significant speed increases that allow inverse problems to be solved, for example for parameter estimation. Nevertheless, the large datasets, non-linear relationships and limited training data make climate a domain that is rich in interesting machine learning challenges. Here I seek to set out the current state of climate model emulation and demonstrate how, despite some challenges, recent advances in machine learning provide new opportunities for creating useful statistical models of the climate.
In this essay, I outline a personal vision of how I think Numerical Weather Prediction (NWP) should evolve in the years leading up to 2030 and hence what it should look like in 2030. By NWP I mean initial-value predictions from timescales of hours to seasons ahead. Here I want to focus on how NWP can better help save lives from increasingly extreme weather in those parts of the world where society is most vulnerable. Whilst we can rightly be proud of many parts of our NWP heritage, its evolution has been influenced by national or institutional politics as well as by underpinning scientific principles. Sometimes these conflict with each other. It is important to be able to separate these issues when discussing how best meteorological science can serve society in 2030; otherwise any disruptive change - no matter how compelling the scientific case for it - becomes impossibly difficult.
Data-driven approaches, most prominently deep learning, have become powerful tools for prediction in many domains. A natural question to ask is whether data-driven methods could also be used to predict global weather patterns days in advance. Initial studies show promise, but the lack of a common dataset and evaluation metrics makes inter-comparison between studies difficult. Here we present a benchmark dataset for data-driven medium-range weather forecasting, a topic of high scientific interest for atmospheric and computer scientists alike. We provide data derived from the ERA5 archive that has been processed to facilitate its use in machine learning models. We propose simple and clear evaluation metrics that enable a direct comparison between different methods. Further, we provide baseline scores from simple linear regression techniques, deep learning models, and purely physical forecasting models. The dataset is publicly available at https://github.com/pangeo-data/WeatherBench and the companion code is reproducible, with tutorials for getting started. We hope that this dataset will accelerate research in data-driven weather forecasting.
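The abstract does not spell out the proposed metrics, so the following is only a sketch of one common choice for global gridded fields: a latitude-weighted root-mean-square error. The weighting scheme, the (time, lat, lon) array layout, and the function name `lat_weighted_rmse` are assumptions made for illustration and are not taken from the benchmark's companion code.

```python
import numpy as np

def lat_weighted_rmse(forecast, truth, lat_deg):
    """RMSE over (lat, lon) with assumed cos(lat) weights, averaged over time.

    forecast, truth: arrays shaped (time, lat, lon); lat_deg: latitudes in degrees.
    """
    weights = np.cos(np.deg2rad(lat_deg))
    weights = weights / weights.mean()  # normalise so the weights average to 1
    sq_err = (forecast - truth) ** 2 * weights[None, :, None]
    return float(np.sqrt(sq_err.mean(axis=(1, 2))).mean())

# Synthetic stand-in fields (not ERA5 data).
rng = np.random.default_rng(1)
lat = np.linspace(-87.5, 87.5, 36)
truth = rng.normal(size=(10, 36, 72))
perfect_score = lat_weighted_rmse(truth, truth, lat)
biased_score = lat_weighted_rmse(truth + 2.0, truth, lat)
```

Because the weights are normalised to a mean of one, a uniform bias of 2 yields a score of 2, which keeps the metric in the units of the forecast variable and makes scores from different methods directly comparable.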