A~machine learning framework is developed to estimate ocean-wave conditions. By supervised training of machine learning models on many thousands of iterations of a physics-based wave model, accurate representations of significant wave heights and period can be used to predict ocean conditions. A model of Monterey Bay was used as the example test site; it was forced by measured wave conditions, ocean-current nowcasts, and reported winds. These input data, along with model outputs of spatially variable wave heights and characteristic period, were aggregated into supervised learning training and test data sets, which were supplied to machine learning models. These machine learning models replicated wave heights with a root-mean-squared error of 9 cm and correctly identified over 90% of the characteristic periods for the test-data sets. Impressively, transforming model inputs to outputs through matrix operations requires less than 1/1,000 of the computation time of forecasting with the physics-based model.
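The idea of replacing a physics-based run with a matrix operation can be illustrated with a minimal sketch. It assumes the simplest possible surrogate, a linear least-squares map fitted with NumPy, which is not necessarily the model class used in the study; the function names `fit_linear_surrogate`, `predict`, and `rmse` are hypothetical, and any inputs passed to them would stand in for forcing data (waves, currents, winds) and wave-model outputs.

```python
import numpy as np


def fit_linear_surrogate(inputs, outputs):
    """Fit a linear map (with bias) from forcing inputs to model outputs.

    inputs:  array of shape (n_samples, n_features)
    outputs: array of shape (n_samples, n_targets), e.g. wave heights
             at several grid points (hypothetical layout).
    """
    inputs = np.asarray(inputs, dtype=float)
    outputs = np.asarray(outputs, dtype=float)
    # Append a column of ones so the fit includes a bias term.
    design = np.column_stack([inputs, np.ones(len(inputs))])
    coeffs, *_ = np.linalg.lstsq(design, outputs, rcond=None)
    return coeffs


def predict(coeffs, inputs):
    """Inference is a single matrix product, which is why it is cheap."""
    inputs = np.asarray(inputs, dtype=float)
    design = np.column_stack([inputs, np.ones(len(inputs))])
    return design @ coeffs


def rmse(predicted, reference):
    """Root-mean-squared error between surrogate and reference outputs."""
    diff = np.asarray(predicted, dtype=float) - np.asarray(reference, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))
```

In practice the surrogate would be fitted on the training set and `rmse` evaluated on the held-out test set, mirroring the train/test split described in the abstract.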
In situ and remotely sensed observations have the potential to facilitate data-driven predictive models for oceanography. A suite of machine learning models, including regression, decision tree, and deep learning approaches, was developed to estimate sea surface temperatures (SST). Training data consisted of satellite-derived SST and atmospheric data from The Weather Company. Models were evaluated in terms of accuracy and computational complexity. Predictive skill was assessed against observations and a state-of-the-art, physics-based model from the European Centre for Medium-Range Weather Forecasts. Results demonstrated that by combining automated feature engineering with machine-learning approaches, accuracy comparable to the existing state of the art can be achieved. Models captured seasonal patterns in the data and qualitatively reproduced short-term variations driven by atmospheric forcing. Further, the study demonstrated that machine-learning-based approaches can be used as transportable prediction tools for ocean variables: the data-driven nature of the approach naturally integrates with automatic deployment frameworks, where model deployments are guided by data rather than user-parametrisation and expertise. The low computational cost of inference makes the approach particularly attractive for edge-based computing, where predictive models could be deployed on low-power devices in the marine environment.
This study investigated an approach to improve the accuracy of computationally lightweight surrogate models by updating forecasts based on historical accuracy relative to sparse observation data. Using a lightweight ocean-wave forecasting model, we created a large number of model ensembles, with perturbed inputs, for a two-year study period. Forecasts were aggregated using a machine-learning algorithm that combined forecasts from multiple, independent models into a single best-estimate prediction of the true state. The framework was applied to a case-study site in Monterey Bay, California. A~learning-aggregation technique used historical observations and model forecasts to calculate a weight for each ensemble member. Weighted ensemble predictions were compared to measured wave conditions to evaluate performance against the present state of the art. Finally, we discussed how this framework, which integrates ensemble aggregations and surrogate models, can be used to improve forecasting systems and further enable scientific process studies.
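One way to compute a weight for each ensemble member from historical forecasts and observations is an exponentially weighted average forecaster. The sketch below assumes that scheme with a squared-error loss; the abstract does not state which aggregation rule the study used, so the rule, the function name `aggregate_forecasts`, and the learning rate `eta` are all illustrative assumptions.

```python
import numpy as np


def aggregate_forecasts(forecasts, observations, eta=1.0):
    """Combine ensemble members using weights learned from past errors.

    forecasts:    array (n_times, n_members), one column per member
    observations: array (n_times,), measured values
    eta:          learning rate (hypothetical tuning parameter)

    Returns (predictions, final_weights). The weights applied at time t
    depend only on errors observed before t.
    """
    forecasts = np.asarray(forecasts, dtype=float)
    observations = np.asarray(observations, dtype=float)
    n_times, n_members = forecasts.shape
    cumulative_loss = np.zeros(n_members)
    predictions = np.empty(n_times)

    def weights_from(loss):
        # Subtract the minimum loss before exponentiating for stability.
        w = np.exp(-eta * (loss - loss.min()))
        return w / w.sum()

    for t in range(n_times):
        w = weights_from(cumulative_loss)
        predictions[t] = w @ forecasts[t]
        # Update each member's loss once the observation is available.
        cumulative_loss += (forecasts[t] - observations[t]) ** 2

    return predictions, weights_from(cumulative_loss)
```

Members that have tracked the observations closely accumulate less loss and therefore receive larger weights, while all members start with equal weight before any observation is seen.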
A primary goal of the National Oceanic and Atmospheric Administration (NOAA) Warn-on-Forecast (WoF) project is to provide rapidly updating probabilistic guidance to human forecasters for short-term (e.g., 0-3 h) severe weather forecasts. Maximizing the usefulness of probabilistic severe weather guidance from an ensemble of convection-allowing model forecasts requires calibration. In this study, we compare the skill of a simple method using updraft helicity against a series of machine learning (ML) algorithms for calibrating WoF System (WoFS) severe weather guidance. ML models are often used to calibrate severe weather guidance since they leverage multiple variables and discover useful patterns in complex datasets. Our dataset includes WoFS ensemble forecasts available every 5 minutes out to 150 min of lead time from the 2017-2019 NOAA Hazardous Weather Testbed Spring Forecasting Experiments (81 dates). Using a novel ensemble storm track identification method, we extracted three sets of predictors from the WoFS forecasts: intra-storm state variables, near-storm environment variables, and morphological attributes of the ensemble storm tracks. We then trained random forests, gradient-boosted trees, and logistic regression algorithms to predict which WoFS 30-min ensemble storm tracks will correspond to a tornado, severe hail, and/or severe wind report. For the simple method, we extracted the ensemble probability of 2-5 km updraft helicity (UH) exceeding a threshold (tuned per severe weather hazard) from each ensemble storm track. The three ML algorithms discriminated well for all three hazards and produced more reliable probabilities than the UH-based predictions. Overall, the results suggest that ML-based calibrations of dynamical ensemble output can improve short-term, storm-scale severe weather probabilistic guidance.
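The simple UH-based baseline can be read as the fraction of ensemble members whose updraft helicity exceeds a threshold. The sketch below assumes each storm track has already been reduced to one UH value per member (for example, a within-track maximum); that reduction, the function name, and the numbers in the usage note are hypothetical and not taken from the study.

```python
import numpy as np


def ensemble_exceedance_probability(member_values, threshold):
    """Fraction of ensemble members whose value exceeds the threshold.

    member_values: one value per ensemble member for a given storm track
                   (assumed here to be a per-member UH summary).
    threshold:     exceedance threshold, tuned separately per hazard.
    """
    member_values = np.asarray(member_values, dtype=float)
    return float(np.mean(member_values > threshold))
```

For instance, with four members at 10, 60, 80, and 30 (illustrative values) and a threshold of 50, two of the four members exceed the threshold, so the probability is 0.5.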
Through ensemble-based data assimilation (DA), we address one of the most notorious difficulties in phase-resolved ocean wave forecasting: the deviation of the numerical solution from the true surface elevation, which arises from the chaotic nature of nonlinear wave models and the physics they underrepresent. In particular, we develop a coupled approach of the high-order spectral (HOS) method with the ensemble Kalman filter (EnKF), through which the measurement data can be incorporated into the simulation to improve the forecast performance. A unique feature of this coupling is the mismatch between the predictable zone and the measurement region, which is accounted for through a special algorithm that modifies the analysis equation in EnKF. We test the performance of the new EnKF-HOS method using both synthetic data and real radar measurements. For both cases (though differing in details), it is shown that the new method achieves much higher accuracy than the HOS-only method, and can retain the phase information of an irregular wave field for arbitrarily long forecast times with sequentially assimilated data.
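For orientation, a standard stochastic EnKF analysis step with perturbed observations can be sketched as follows. This is the textbook form, not the modified analysis equation the abstract describes for handling the mismatch between predictable zone and measurement region; the function name, the linear observation operator `H`, and the scalar observation variance `obs_var` are assumptions made for illustration.

```python
import numpy as np


def enkf_analysis(ensemble, obs, H, obs_var, rng):
    """Standard stochastic EnKF update (textbook form).

    ensemble: array (n_state, n_members), forecast states as columns
    obs:      array (n_obs,), measurements
    H:        array (n_obs, n_state), linear observation operator
    obs_var:  scalar observation-error variance (assumed uncorrelated)
    rng:      numpy random Generator used to perturb the observations
    """
    ensemble = np.asarray(ensemble, dtype=float)
    obs = np.asarray(obs, dtype=float)
    n_members = ensemble.shape[1]

    # Ensemble anomalies about the mean, in state and observation space.
    anomalies = ensemble - ensemble.mean(axis=1, keepdims=True)
    obs_anomalies = H @ anomalies

    # Sample estimates of P H^T and of the innovation covariance S.
    cross_cov = anomalies @ obs_anomalies.T / (n_members - 1)
    innov_cov = obs_anomalies @ obs_anomalies.T / (n_members - 1)
    innov_cov = innov_cov + obs_var * np.eye(len(obs))

    # Kalman gain K = P H^T S^{-1}; S is symmetric, so solve instead of inverting.
    gain = np.linalg.solve(innov_cov, cross_cov.T).T

    # Each member assimilates its own perturbed copy of the observations.
    noise = rng.normal(0.0, np.sqrt(obs_var), size=(len(obs), n_members))
    perturbed_obs = obs[:, None] + noise
    return ensemble + gain @ (perturbed_obs - H @ ensemble)
```

In a sequential setting, the returned analysis ensemble would be propagated forward by the wave model (HOS in the abstract) until the next batch of measurements arrives.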
We assess the value of machine learning as an accelerator for the parameterisation schemes of operational weather forecasting systems, specifically the parameterisation of non-orographic gravity wave drag. Emulators of this scheme can be trained to produce stable and accurate results up to seasonal forecasting timescales. Generally, more complex networks produce more accurate emulators. By training on an increased-complexity version of the existing parameterisation scheme, we build emulators that produce more accurate forecasts. For medium-range forecasting, we find evidence that our emulators are more accurate than the version of the parameterisation scheme that is used for operational predictions. Using the current operational CPU hardware, our emulators have a similar computational cost to the existing scheme, but are heavily limited by data movement. On GPU hardware, our emulators perform ten times faster than the existing scheme on a CPU.