Modern weather and climate models share a common heritage, and often even components; however, they are used in different ways to answer fundamentally different questions. Attempts to emulate them using machine learning should reflect this. While the use of machine learning to emulate weather forecast models is a relatively new endeavour, there is a rich history of climate model emulation. This is primarily because weather modelling is an initial condition problem that depends intimately on the current state of the atmosphere, whereas climate modelling is predominantly a boundary condition problem. To emulate the response of the climate to different drivers, therefore, representing the full dynamical evolution of the atmosphere is neither necessary nor, in many cases, desirable. Climate scientists are also typically interested in different questions. Indeed, emulating the steady-state climate response has been possible for many years and provides significant speed increases that allow inverse problems, such as parameter estimation, to be solved. Nevertheless, the large datasets, non-linear relationships and limited training data make climate a domain that is rich in interesting machine learning challenges. Here I seek to set out the current state of climate model emulation and demonstrate how, despite some challenges, recent advances in machine learning provide new opportunities for creating useful statistical models of the climate.
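The use of a steady-state emulator for parameter estimation can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: `toy_climate_model` is a hypothetical stand-in for an expensive model run, the quadratic polynomial is just one possible emulator, and the "observation" is synthetic.

```python
import numpy as np

def toy_climate_model(sensitivity):
    """Hypothetical stand-in for an expensive model run: maps one
    parameter to a steady-state response (arbitrary units)."""
    return 1.2 * sensitivity + 0.3 * sensitivity ** 2

# 1. Run the "expensive" model at a handful of parameter values.
train_params = np.linspace(0.5, 4.0, 6)
train_response = toy_climate_model(train_params)

# 2. Fit a cheap emulator (a quadratic polynomial, chosen for illustration).
emulator = np.poly1d(np.polyfit(train_params, train_response, deg=2))

# 3. Inverse problem: which parameter value best explains an observation?
observed = toy_climate_model(2.7)         # synthetic "observation"
candidates = np.linspace(0.5, 4.0, 3501)  # dense search is cheap with the emulator
misfit = (emulator(candidates) - observed) ** 2
best = candidates[np.argmin(misfit)]

print(f"estimated parameter: {best:.3f}")
```

Because the emulator is cheap to evaluate, the dense search in step 3 costs almost nothing, whereas the same search with the full model would require thousands of runs.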
This popular article provides a short summary of the progress and prospects in Weather and Climate Modelling for the benefit of high school and undergraduate college students and early career researchers. Although this is not a comprehensive scientific article, the basic information provided here is intended to introduce students and researchers to the topic of Weather and Climate Modelling, which falls under the broad discipline of Atmospheric / Oceanic / Climate / Earth Sciences. This article briefly summarizes the historical developments, progress and scientific challenges in weather and climate modelling, as well as career opportunities.
Climate models are complicated software systems that approximate atmospheric and oceanic fluid mechanics at a coarse spatial resolution. Typical climate forecasts explicitly resolve only processes larger than 100 km and approximate any process occurring below this scale (e.g. thunderstorms) using so-called parametrizations. Machine learning could improve upon the accuracy of some traditional physical parametrizations by learning from so-called global cloud-resolving models. We compare the performance of two machine learning models, random forests (RF) and neural networks (NNs), at parametrizing the aggregate effect of moist physics in a 3 km resolution global simulation with an atmospheric model. The NN outperforms the RF when evaluated offline on a testing dataset. However, when the ML models are coupled to an atmospheric model run at 200 km resolution, the NN-assisted simulation crashes within 7 days, while the RF-assisted simulations remain stable. Both runs produce more accurate weather forecasts than a baseline configuration, but globally averaged climate variables drift over longer timescales.
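The gap between offline skill and coupled behaviour can be illustrated with a toy example. This is a sketch under stated assumptions, not the configuration used in the study: `true_tendency` and `learned_tendency` are hypothetical scalar functions, and the "emulator" is simply the truth plus a small constant bias.

```python
import numpy as np

def true_tendency(x):
    # Hypothetical "truth": relaxation towards an equilibrium of 1.0.
    return -0.1 * (x - 1.0)

def learned_tendency(x):
    # Hypothetical emulator: the truth plus a small systematic bias.
    return true_tendency(x) + 0.005

# Offline evaluation: compare single-step tendencies on held-out states.
states = np.linspace(0.0, 2.0, 101)
offline_rmse = np.sqrt(
    np.mean((learned_tendency(states) - true_tendency(states)) ** 2)
)

def integrate(tendency, x0, n_steps):
    """Step the state forward, feeding each output back in as input."""
    x = x0
    for _ in range(n_steps):
        x = x + tendency(x)
    return x

# Online (coupled) evaluation: errors feed back on the state over time.
x_true = integrate(true_tendency, 1.0, 500)
x_coupled = integrate(learned_tendency, 1.0, 500)
online_drift = abs(x_coupled - x_true)

print(f"offline RMSE: {offline_rmse:.4f}, online drift: {online_drift:.4f}")
```

In this toy setting the per-step error is small, but the coupled run settles to a shifted equilibrium, so the long-run drift is an order of magnitude larger than the offline error. Real coupled simulations involve many interacting variables, and their failure modes, including crashes, are not captured by this scalar example.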
Statistical postprocessing techniques are nowadays key components of the forecasting suites in many National Meteorological Services (NMS); for most of them, the objective is to correct the impact of different types of errors on the forecasts. The final aim is to provide optimal, automated, seamless forecasts for end users. Many techniques are now flourishing in the statistical, meteorological, climatological, hydrological, and engineering communities. The methods range in complexity from simple bias corrections to very sophisticated distribution-adjusting techniques that incorporate correlations among the prognostic variables. The paper is an attempt to summarize the main activities going on in this area, from theoretical developments to operational applications, with a focus on the current challenges and potential avenues in the field. Among these challenges are the shift in NMS towards running ensemble Numerical Weather Prediction (NWP) systems at the kilometer scale that produce very large datasets and require high-density, high-quality observations; the necessity to preserve the space-time correlation of high-dimensional corrected fields; the need to reduce the impact of model changes affecting the parameters of the corrections; the necessity for techniques to merge different types of forecasts and ensembles with different behaviors; and finally the ability to transfer research on statistical postprocessing to operations. Potential new avenues will also be discussed.
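The simple end of that range can be sketched as follows. The data are synthetic and every name here is hypothetical; the sketch compares a mean bias correction with a linear regression of observations on forecasts, two basic corrections chosen for illustration rather than methods attributed to any particular service.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_data(n):
    """Synthetic observations and raw forecasts (purely illustrative)."""
    obs = rng.normal(15.0, 5.0, size=n)
    # Raw forecast: offset, damped and noisy relative to the observations.
    forecast = 2.0 + 0.8 * obs + rng.normal(0.0, 1.0, size=n)
    return obs, forecast

obs_train, fc_train = make_data(2000)
obs_test, fc_test = make_data(2000)

def rmse(pred, obs):
    return float(np.sqrt(np.mean((pred - obs) ** 2)))

# Simple bias correction: subtract the mean error seen in training.
mean_bias = np.mean(fc_train - obs_train)
fc_bias = fc_test - mean_bias

# Linear correction: regress observations on forecasts.
slope, intercept = np.polyfit(fc_train, obs_train, deg=1)
fc_linear = intercept + slope * fc_test

rmse_raw = rmse(fc_test, obs_test)
rmse_bias = rmse(fc_bias, obs_test)
rmse_linear = rmse(fc_linear, obs_test)

print(f"raw: {rmse_raw:.2f}, bias-corrected: {rmse_bias:.2f}, "
      f"linear: {rmse_linear:.2f}")
```

In this synthetic setup each correction should reduce the error on the independent test period. Neither correction adjusts the forecast distribution or preserves correlations between variables, which is where the more sophisticated techniques come in.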
We assess the value of machine learning as an accelerator for the parameterisation schemes of operational weather forecasting systems, specifically the parameterisation of non-orographic gravity wave drag. Emulators of this scheme can be trained to produce stable and accurate results up to seasonal forecasting timescales. Generally, more complex networks produce more accurate emulators. By training on an increased-complexity version of the existing parameterisation scheme we build emulators that produce more accurate forecasts. For medium-range forecasting we find evidence that our emulators are more accurate than the version of the parameterisation scheme that is used for operational predictions. Using the current operational CPU hardware our emulators have a similar computational cost to the existing scheme, but are heavily limited by data movement. On GPU hardware our emulators perform ten times faster than the existing scheme on a CPU.
Global climate models represent small-scale processes such as clouds and convection using quasi-empirical models known as parameterizations, and these parameterizations are a leading cause of uncertainty in climate projections. A promising alternative approach is to use machine learning to build new parameterizations directly from high-resolution model output. However, parameterizations learned from three-dimensional model output have not yet been successfully used for simulations of climate. Here we use a random forest to learn a parameterization of subgrid processes from output of a three-dimensional high-resolution atmospheric model. Integrating this parameterization into the atmospheric model leads to stable simulations at coarse resolution that replicate the climate of the high-resolution simulation. The parameterization obeys physical constraints and captures important statistics such as precipitation extremes. The ability to learn from a fully three-dimensional simulation presents an opportunity for learning parameterizations from the wide range of global high-resolution simulations that are now emerging.
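One plausible reason a random forest can respect such constraints, offered here as an illustration rather than a description of the study's implementation, is that its predictions are averages of training targets. If every training target satisfies a linear constraint or is non-negative, so does any average of them. The sketch below mimics this with random subsets standing in for tree leaves; the data, the zero-sum constraint and all names are hypothetical, and no real forest is trained.

```python
import numpy as np

rng = np.random.default_rng(42)
n_samples, n_levels = 200, 10

# Hypothetical training targets: tendency profiles whose sum over levels
# is zero (a stand-in for a linear conservation constraint).
tendencies = rng.normal(size=(n_samples, n_levels))
tendencies -= tendencies.mean(axis=1, keepdims=True)

# Hypothetical non-negative target (e.g. a precipitation-like quantity).
precip = rng.exponential(scale=2.0, size=n_samples)

def forest_like_prediction(n_trees=25, leaf_size=8):
    """Mimic a forest output: each 'tree' returns the mean of the training
    samples in one leaf (here a random subset), and the 'forest' averages
    the trees."""
    tree_tendencies, tree_precip = [], []
    for _ in range(n_trees):
        leaf = rng.choice(n_samples, size=leaf_size, replace=False)
        tree_tendencies.append(tendencies[leaf].mean(axis=0))
        tree_precip.append(precip[leaf].mean())
    return np.mean(tree_tendencies, axis=0), float(np.mean(tree_precip))

pred_tendency, pred_precip = forest_like_prediction()

print(f"sum over levels: {pred_tendency.sum():.2e}, precip: {pred_precip:.2f}")
```

The predicted profile sums to zero up to floating-point rounding and the predicted precipitation-like value is non-negative, whichever subsets are drawn. A regression that extrapolates beyond its training targets carries no such guarantee without additional constraints.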