This study investigated an approach to improve the accuracy of computationally lightweight surrogate models by updating forecasts based on historical accuracy relative to sparse observation data. Using a lightweight ocean-wave forecasting model, we created a large number of model ensembles with perturbed inputs for a two-year study period. Forecasts were aggregated using a machine-learning algorithm that combined forecasts from multiple independent models into a single best-estimate prediction of the true state. The framework was applied to a case-study site in Monterey Bay, California. A~learning-aggregation technique used historical observations and model forecasts to calculate a weight for each ensemble member. Weighted ensemble predictions were compared to measured wave conditions to evaluate performance against the present state of the art. Finally, we discussed how this framework, which integrates ensemble aggregation and surrogate models, can be used to improve forecasting systems and further enable scientific process studies.
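The weighting step described above can be illustrated with a minimal sketch. It assumes an exponentially weighted average forecaster with squared-error loss, which is one common aggregation rule and not necessarily the one used in the study; the function names and the learning rate `eta` are hypothetical.

```python
import math


def aggregation_weights(past_forecasts, past_observations, eta=1.0):
    """Return one weight per ensemble member from its historical accuracy.

    past_forecasts[t][i] is the forecast of member i at past time t.
    Assumption: exponential weighting of cumulative squared error.
    """
    n_members = len(past_forecasts[0])
    losses = [0.0] * n_members
    for forecasts, observation in zip(past_forecasts, past_observations):
        for i, forecast in enumerate(forecasts):
            losses[i] += (forecast - observation) ** 2
    # Subtract the smallest loss before exponentiating to avoid underflow.
    min_loss = min(losses)
    raw = [math.exp(-eta * (loss - min_loss)) for loss in losses]
    total = sum(raw)
    return [r / total for r in raw]


def aggregate(forecasts, weights):
    """Combine the members' current forecasts into a single prediction."""
    return sum(w * f for w, f in zip(weights, forecasts))
```

Members that tracked past observations closely receive larger weights, so the aggregated forecast is pulled toward them. The value of `eta` controls how sharply the weights concentrate on the best member and would need tuning in practice.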
In situ and remotely sensed observations have the potential to facilitate data-driven predictive models for oceanography. A suite of machine learning models, including regression, decision tree, and deep learning approaches, was developed to estimate sea surface temperatures (SST). Training data consisted of satellite-derived SST and atmospheric data from The Weather Company. Models were evaluated in terms of accuracy and computational complexity. Predictive skill was assessed against observations and a state-of-the-art, physics-based model from the European Centre for Medium-Range Weather Forecasts. Results demonstrated that by combining automated feature engineering with machine-learning approaches, accuracy comparable to the existing state of the art can be achieved. Models captured seasonal patterns in the data and qualitatively reproduced short-term variations driven by atmospheric forcing. Further, the study demonstrated that machine-learning-based approaches can be used as transportable prediction tools for ocean variables. The data-driven nature of the approach integrates naturally with automatic deployment frameworks, where model deployments are guided by data rather than by user parametrisation and expertise. The low computational cost of inference makes the approach particularly attractive for edge-based computing, where predictive models could be deployed on low-power devices in the marine environment.
Through ensemble-based data assimilation (DA), we address one of the most notorious difficulties in phase-resolved ocean wave forecasting: the deviation of the numerical solution from the true surface elevation, caused by the chaotic nature of, and underrepresented physics in, nonlinear wave models. In particular, we develop a coupled approach of the high-order spectral (HOS) method with the ensemble Kalman filter (EnKF), through which measurement data can be incorporated into the simulation to improve forecast performance. A unique feature of this coupling is the mismatch between the predictable zone and the measurement region, which is accounted for through a special algorithm that modifies the analysis equation in EnKF. We test the performance of the new EnKF-HOS method using both synthetic data and real radar measurements. For both cases (though differing in details), it is shown that the new method achieves much higher accuracy than the HOS-only method, and can retain the phase information of an irregular wave field for an arbitrarily long forecast time with sequentially assimilated data.
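For readers unfamiliar with the EnKF, the following is a minimal sketch of a standard stochastic (perturbed-observation) analysis step for a single, directly observed scalar state. It is not the modified analysis equation developed for EnKF-HOS, and it does not handle the mismatch between predictable zone and measurement region; the function name and arguments are hypothetical.

```python
import random
from statistics import variance


def enkf_analysis_scalar(ensemble, observation, obs_var, rng):
    """Standard stochastic EnKF update for a directly observed scalar.

    ensemble    : list of forecast values, one per member (at least two)
    observation : the measured value
    obs_var     : observation-error variance
    rng         : a random.Random instance used to perturb the observation
    """
    forecast_var = variance(ensemble)  # sample variance of the forecast
    gain = forecast_var / (forecast_var + obs_var)  # scalar Kalman gain
    obs_std = obs_var ** 0.5
    analysis = []
    for member in ensemble:
        # Each member assimilates its own noisy copy of the observation.
        perturbed_obs = observation + rng.gauss(0.0, obs_std)
        analysis.append(member + gain * (perturbed_obs - member))
    return analysis, gain
```

The gain approaches 1 when the observation is much more certain than the forecast spread, so members move almost entirely to the measurement; it approaches 0 in the opposite case, leaving the forecast nearly unchanged.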
A~machine learning framework is developed to estimate ocean-wave conditions. Through supervised training of machine learning models on many thousands of iterations of a physics-based wave model, accurate representations of significant wave height and period are obtained and can be used to predict ocean conditions. A model of Monterey Bay was used as the example test site; it was forced by measured wave conditions, ocean-current nowcasts, and reported winds. These input data, along with model outputs of spatially variable wave heights and characteristic period, were aggregated into supervised-learning training and test data sets, which were supplied to the machine learning models. These models replicated wave heights with a root-mean-squared error of 9 cm and correctly identified over 90% of the characteristic periods for the test data sets. Notably, transforming model inputs to outputs through matrix operations requires only a fraction (<1/1,000) of the computation time needed to forecast with the physics-based model.
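To make the phrase "transforming model inputs to outputs through matrix operations" concrete, here is a minimal sketch of surrogate inference. It assumes the trained surrogate is a small feed-forward network with ReLU activations, which the abstract does not specify; the layer layout and any weights passed in are hypothetical.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def relu(vector):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in vector]


def surrogate_forward(inputs, layers):
    """Map forcing inputs to predicted wave outputs.

    layers is a list of (weights, biases) pairs. ReLU is applied after
    every layer except the last, which is left linear for regression.
    """
    activation = inputs
    for index, (weights, biases) in enumerate(layers):
        pre_activation = matvec(weights, activation)
        activation = [z + b for z, b in zip(pre_activation, biases)]
        if index < len(layers) - 1:
            activation = relu(activation)
    return activation
```

Inference costs only a handful of matrix-vector products, which is why a surrogate of this kind can run far faster than stepping a physics-based wave model.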
Tropical cyclones are among the most powerful and destructive natural phenomena on Earth. Tropical storms and heavy rains can cause floods, which lead to loss of human life and economic damage. The devastating winds that accompany cyclones heavily affect not only coastal regions but also distant areas. Our study focuses on intensity estimation, in particular the cyclone grade and maximum sustained surface wind speed (MSWS) of a tropical cyclone over the North Indian Ocean. We use various machine learning algorithms to estimate cyclone grade and MSWS. The attributes of our models are the basin of origin, date, time, latitude, longitude, estimated central pressure, and pressure drop. We use multi-class classification models for cyclone grade, a categorical outcome variable, and regression models for MSWS, a continuous variable. Using 28 years of best track data over the North Indian Ocean, we estimate grade with an accuracy of 88% and MSWS with a root mean square error (RMSE) of 2.3. For higher grade categories (5-7), accuracy improves to an average of 98.84%. We tested our model on two recent tropical cyclones in the North Indian Ocean, Vayu and Fani. For grade, we obtained accuracies of 93.22% and 95.23%, respectively, while for MSWS we obtained RMSE of 2.2 and 3.4 and $R^2$ of 0.99 and 0.99, respectively.
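The two regression metrics reported above, RMSE and $R^2$, can be computed as in the sketch below. The function names are hypothetical, and any wind-speed values a reader passes in are their own; nothing here reproduces the study's data.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted values."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot
```

RMSE is expressed in the units of the target variable, whereas $R^2$ is dimensionless, so the two are best read together: a low RMSE with a high $R^2$ indicates that predictions are both close to the observations and explain most of their variance.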
Progress within physical oceanography has been concurrent with the increasing sophistication of tools available for its study. The incorporation of machine learning (ML) techniques offers exciting possibilities for advancing the capacity and speed of established methods and also for making substantial and serendipitous discoveries. Beyond the vast amounts of complex data ubiquitous in many modern scientific fields, the study of the ocean poses a combination of unique challenges that ML can help address. The observational data available are largely spatially sparse and limited to the surface, with few time series spanning more than a handful of decades. Important timescales span seconds to millennia, with strong scale interactions, and numerical modelling efforts are complicated by details such as coastlines. This review covers the current scientific insight offered by applying ML and points to where there is imminent potential. We cover the three main branches of the field: observations, theory, and numerical modelling. Highlighting both challenges and opportunities, we discuss both the historical context and salient ML tools. We focus on the use of ML in in situ sampling and satellite observations, and on the extent to which ML applications can advance theoretical oceanographic exploration and aid numerical simulations. Applications covered also include model error and bias correction, as well as current and potential use within data assimilation. While not without risk, there is great interest in the potential benefits of oceanographic ML applications; this review caters to this interest within the research community.