Solar Flares Forecasting Using Time Series and Extreme Gradient Boosting Ensembles


Abstract in English

Space weather events can damage several sectors, including aviation, satellites, the oil and gas industries, and electrical systems, leading to economic and commercial losses. Solar flares are among the most significant of these events: they are sudden releases of radiation that can affect the Earth's atmosphere within a few hours or minutes. It is therefore worth designing high-performance systems for forecasting them. Although the literature offers many approaches to flare forecasting, there is still no consensus on the techniques used to design these systems. Seeking to establish some standardization in the design of flare predictors, in this study we propose a novel methodology for designing such predictors, validated with extreme gradient boosting tree classifiers and time series. The methodology relies on the following well-defined machine learning pipeline: (i) univariate feature selection; (ii) randomized hyper-parameter optimization; (iii) imbalanced data treatment; (iv) adjustment of the classifiers' cut-off point; and (v) evaluation under operational settings. To verify the effectiveness of our methodology, we designed and evaluated three proof-of-concept models for forecasting $\geq C$ class flares up to 72 hours ahead. Compared to baseline models, these models significantly increased their true skill statistic (TSS) scores under operational forecasting scenarios by 0.37 (predicting flares in the next 24 hours), 0.13 (predicting flares within 24-48 hours), and 0.36 (predicting flares within 48-72 hours). Besides increasing the TSS, the methodology also led to significant increases in the area under the ROC curve, corroborating that we improved the positive and negative recalls of the classifiers while decreasing the number of false alarms.
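
As an illustration of step (iv) and of the TSS metric reported above, the following is a minimal sketch, not the implementation used in the study. It assumes the standard definition TSS = TP/(TP+FN) - FP/(FP+TN), that is, positive recall plus negative recall minus one. The function names `true_skill_statistic` and `best_cutoff`, the candidate grid of thresholds, and the toy labels and scores are all hypothetical and chosen only for this example.

```python
def true_skill_statistic(y_true, y_pred):
    """TSS = TP/(TP+FN) - FP/(FP+TN) for binary labels (1 = flare, 0 = no flare)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    if tp + fn == 0 or fp + tn == 0:
        raise ValueError("TSS needs at least one positive and one negative sample")
    return tp / (tp + fn) - fp / (fp + tn)


def best_cutoff(y_true, scores, candidates=None):
    """Return the (cut-off, TSS) pair with the highest TSS over a candidate grid.

    The grid 0.01, 0.02, ..., 0.99 is an assumption made for this sketch.
    Ties are resolved in favor of the smallest cut-off.
    """
    if candidates is None:
        candidates = [i / 100 for i in range(1, 100)]
    best_t, best_tss = None, float("-inf")
    for t in candidates:
        # A sample is predicted positive when its score reaches the cut-off.
        y_pred = [1 if s >= t else 0 for s in scores]
        tss = true_skill_statistic(y_true, y_pred)
        if tss > best_tss:
            best_t, best_tss = t, tss
    return best_t, best_tss


# Toy data (hypothetical): imbalanced labels and classifier scores.
y_true = [0, 0, 0, 0, 1, 1]
scores = [0.1, 0.2, 0.3, 0.6, 0.4, 0.9]

default_pred = [1 if s >= 0.5 else 0 for s in scores]
default_tss = true_skill_statistic(y_true, default_pred)
cutoff, tuned_tss = best_cutoff(y_true, scores)
print(default_tss, cutoff, tuned_tss)
```

On imbalanced data the default cut-off of 0.5 is often a poor choice, which is why the cut-off is treated here as a tunable quantity. In practice the search would be run on validation data rather than on the test set, so that the reported TSS is not optimistically biased.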
