Solar flares produce radiation that can have an almost immediate effect on the near-Earth environment, making it crucial to forecast flares in order to mitigate their negative effects. The number of published approaches to flare forecasting using photospheric magnetic field observations has proliferated, with varying claims about how well each works. Because of the different analysis techniques and data sets used, it is essentially impossible to compare the results from the literature. This problem is exacerbated by the low event rates of large solar flares. The challenges of forecasting rare events have long been recognized in the meteorology community, but have yet to be fully acknowledged by the space weather community. During the interagency workshop on all-clear forecasts held in Boulder, CO, in 2009, the performance of a number of existing algorithms was compared on common data sets, specifically line-of-sight magnetic field and continuum intensity images from MDI, with consistent definitions of what constitutes an event. We demonstrate the importance of making such systematic comparisons and of using standard verification statistics to determine what constitutes a good prediction scheme. When a comparison was made in this fashion, no single method clearly outperformed all others, which may in part be due to the strong correlations among the parameters used by different methods to characterize an active region. For M-class flares and above, the set of methods tends toward a weakly positive skill score (as measured with several distinct metrics), with no participating method proving substantially better than climatological forecasts.
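As a concrete illustration of verification against a climatological reference, one of several distinct metric families invoked above, the sketch below computes the Brier Skill Score: the fractional improvement of probabilistic forecasts over always issuing the sample event rate. This is a minimal illustration of the general technique, not the workshop's own verification code; the function name and example data are ours.

```python
import numpy as np

def brier_skill_score(prob_forecasts, outcomes):
    """Brier Skill Score relative to a climatological reference.

    prob_forecasts : forecast probabilities in [0, 1]
    outcomes       : observed event flags (0 or 1)

    BSS > 0 beats climatology; BSS = 0 matches it; BSS < 0 is worse.
    """
    prob_forecasts = np.asarray(prob_forecasts, dtype=float)
    outcomes = np.asarray(outcomes, dtype=float)

    # Brier score of the forecasts: mean squared probability error.
    bs = np.mean((prob_forecasts - outcomes) ** 2)

    # Reference forecast: always issue the sample event rate (climatology).
    climatology = outcomes.mean()
    bs_ref = np.mean((climatology - outcomes) ** 2)

    return 1.0 - bs / bs_ref

# Hypothetical rare-event example: low base rate, modestly skilled forecasts.
outcomes = np.array([0, 0, 0, 1, 0, 0, 0, 0, 1, 0])
forecasts = np.array([0.1, 0.1, 0.2, 0.6, 0.1, 0.1, 0.1, 0.2, 0.5, 0.1])
print(f"BSS = {brier_skill_score(forecasts, outcomes):.3f}")
```

For rare events the climatological Brier score is itself small, which is one reason why weakly positive skill scores are a common outcome for large-flare forecasting.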
Solar flares are extremely energetic phenomena in our Solar System. Their impulsive, often drastic radiative increases, in particular at short wavelengths, bring immediate impacts that motivate solar physics and space weather research to understand solar flares to the point of being able to forecast them. As data and algorithms improve dramatically, questions must be asked about how well the forecasting performs; crucially, we must ask how to rigorously measure performance in order to critically gauge any improvements. Building upon earlier-developed methodology (Barnes et al. 2016, Paper I), international representatives of regional warning centers and research facilities assembled in 2017 at the Institute for Space-Earth Environmental Research, Nagoya University, Japan, to directly compare, for the first time, the performance of operational solar flare forecasting methods. Multiple quantitative evaluation metrics are employed, with focus and discussion on evaluation methodologies given the restrictions of operational forecasting. Numerous methods performed consistently above the no-skill level, although which method scored top marks is decisively a function of the flare event definition and the metric used; there was no single winner. In subsequent papers of this series, we ask why the performances differ by examining implementation details (Leka et al. 2019, Paper III), and we present a novel analysis method to evaluate temporal patterns of forecasting errors (Park et al. 2019, Paper IV). With these works, this team presents a well-defined and robust methodology for evaluating solar flare forecasting methods in both research and operational frameworks, and today's performance benchmarks against which improvements and new methods may be compared.
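Operational forecasts are often verified as dichotomous (yes/no) decisions, so the metric dependence noted above can be made tangible with contingency-table skill scores. The following is a minimal sketch, under our own naming and default threshold, of two widely used scores; the workshop employed a broader suite of metrics than shown here.

```python
import numpy as np

def dichotomous_skill_scores(prob_forecasts, outcomes, threshold=0.5):
    """True Skill Statistic (TSS) and Heidke Skill Score (HSS) from a
    2x2 contingency table, after converting probabilistic forecasts to
    yes/no decisions at the given threshold.
    """
    f = np.asarray(prob_forecasts) >= threshold   # forecast "yes"
    o = np.asarray(outcomes).astype(bool)         # observed event

    hits = np.sum(f & o)            # forecast yes, event occurred
    false_alarms = np.sum(f & ~o)   # forecast yes, no event
    misses = np.sum(~f & o)         # forecast no, event occurred
    correct_nulls = np.sum(~f & ~o) # forecast no, no event
    n = hits + false_alarms + misses + correct_nulls

    # TSS = probability of detection minus probability of false detection.
    tss = (hits / (hits + misses)
           - false_alarms / (false_alarms + correct_nulls))

    # HSS: fractional improvement over the number correct by chance.
    expected = ((hits + misses) * (hits + false_alarms)
                + (correct_nulls + misses) * (correct_nulls + false_alarms)) / n
    hss = (hits + correct_nulls - expected) / (n - expected)
    return tss, hss
```

Because the two scores weight false alarms and misses differently, and because both depend on the chosen probability threshold, the same set of forecasts can rank methods differently under each metric, consistent with there being no single winner.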
A workshop was recently held at Nagoya University (31 October to 2 November 2017), sponsored by the Center for International Collaborative Research at the Institute for Space-Earth Environmental Research, Nagoya University, Japan, to quantitatively compare the performance of today's operational solar flare forecasting facilities. Building upon Paper I of this series (Barnes et al. 2016), in Paper II (Leka et al. 2019) we described the participating methods for this latest comparison effort and the evaluation methodology, and presented quantitative comparisons. In this paper we focus on the behavior and performance of the methods when evaluated in the context of broad implementation differences. Acknowledging the short testing interval and the small number of participating methods, we do find that forecast performance: 1) appears to improve by including persistence or prior flare activity, region evolution, and a human forecaster in the loop; 2) is hurt by restricting data to disk-center observations; and 3) may benefit from long-term statistics, but mostly when combined with modern data sources and statistical approaches. These trends are arguably weak and must be viewed with numerous caveats, as discussed both here and in Paper II. Following this present work, we present in Paper IV a novel analysis method to evaluate temporal patterns of forecasting errors of both types (i.e., misses and false alarms; Park et al. 2019). Most importantly, with this series of papers we demonstrate techniques for facilitating comparisons in the interest of establishing performance-positive methodologies.
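To make the persistence ingredient concrete: in its purest form, a persistence forecast simply reissues yesterday's observed activity as today's prediction. The sketch below is our own illustrative baseline, not any participating method's implementation.

```python
import numpy as np

def persistence_forecast(daily_event_flags):
    """24-hr persistence baseline: the forecast for day t is the
    observed outcome of day t-1. No forecast is issued for the
    first day, which has no prior observation.
    """
    flags = np.asarray(daily_event_flags)
    return flags[:-1]  # aligns element-wise with observations flags[1:]

# Example: score the baseline against the observed sequence.
observed = np.array([0, 0, 1, 1, 0, 0, 1, 0])
forecast = persistence_forecast(observed)
accuracy = np.mean(forecast == observed[1:])
print(f"Persistence accuracy: {accuracy:.2f}")
```

Because flaring is strongly clustered in time, such a baseline is surprisingly hard to beat, which is consistent with the finding that folding persistence into a forecast tends to help.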
A crucial challenge to successful flare prediction is forecasting periods that transition between flare-quiet and flare-active states. Building on earlier studies in this series (Barnes et al. 2016; Leka et al. 2019a,b), in which we describe the methodology, details, and results of flare forecasting comparison efforts, we focus here on patterns of forecast outcomes (success and failure) over multi-day periods. A novel analysis is developed to evaluate forecasting success in the context of catching the first event of flare-active periods and, conversely, of correctly predicting declining flare activity. We demonstrate these evaluation methods graphically and quantitatively, as they provide both quick comparative evaluations and options for detailed analysis. For the testing interval 2016-2017, we determine the relative frequency distribution of two-day dichotomous forecast outcomes for three different event histories (i.e., event/event, no-event/event, and event/no-event) and use it to highlight performance differences between forecasting methods. A trend is identified across all forecasting methods: a forecast probability that is high (or low) on day 1 tends to remain high (or low) on day 2, even when flaring activity is transitioning. For M-class and larger flares, we find that explicitly including persistence or prior flare history in computing forecasts helps to improve overall forecast performance. It is also found that using magnetic or other modern data leads to improvement in catching the first-event/first-no-event transitions. Finally, 15% of major (i.e., M-class or above) flare days over the testing interval were effectively missed due to a lack of observations from instruments away from the Earth-Sun line.
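The bookkeeping behind the two-day event histories can be illustrated with a short sketch. The function below (our own minimal version, not the paper's analysis code) tallies the relative frequency of each consecutive-day pattern from a daily event series; conditioning forecast outcomes on these histories proceeds analogously.

```python
import numpy as np
from collections import Counter

def two_day_transitions(daily_event_flags):
    """Tally consecutive-day event histories over a testing interval:
    event/event, no-event/event, event/no-event, no-event/no-event.
    Returns the relative frequency of each two-day pattern.
    """
    flags = np.asarray(daily_event_flags).astype(bool)
    labels = {(True, True): "event/event",
              (True, False): "event/no-event",
              (False, True): "no-event/event",
              (False, False): "no-event/no-event"}
    pairs = Counter(labels[(bool(a), bool(b))]
                    for a, b in zip(flags[:-1], flags[1:]))
    total = sum(pairs.values())
    return {k: v / total for k, v in pairs.items()}

# Hypothetical daily M-class-or-above event flags.
observed = [0, 0, 1, 1, 1, 0, 0, 1, 0, 0]
print(two_day_transitions(observed))
```

The no-event/event and event/no-event bins isolate exactly the transitions that the abstract identifies as the hardest cases: forecasts that stay high (or low) across day boundaries miss these bins preferentially.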
Solar Energetic Particle (SEP) events are among the most dangerous transient phenomena of solar activity. As hazardous radiation, SEPs may affect the health of astronauts in outer space and adversely impact current and future space exploration. In this paper, we consider the problem of daily prediction of Solar Proton Events (SPEs) based on the characteristics of the magnetic fields in solar Active Regions (ARs), preceding soft X-ray and proton fluxes, and statistics of solar radio bursts. The machine learning (ML) algorithm uses an artificial neural network of custom architecture designed for whole-Sun input. The predictions of the ML model are compared with the NOAA SWPC operational forecasts of SPEs. Our preliminary results indicate that: 1) for the AR-based predictions, it is necessary to take into account ARs at the western limb and on the far side of the Sun; 2) characteristics of the preceding proton flux represent the most valuable input for prediction; 3) daily median characteristics of ARs and the counts of type II, III, and IV radio bursts may be excluded from the forecast without performance loss; and 4) ML-based forecasts outperform NOAA SWPC forecasts in situations in which missing SPE events is highly undesirable. The introduced approach indicates the possibility of developing robust all-clear SPE forecasts by employing machine learning methods.
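One generic way to build a network that accepts whole-Sun input is to encode each AR with a shared network, pool over however many ARs are present, and concatenate the result with global inputs such as preceding fluxes and radio-burst counts. The sketch below is a hypothetical illustration of that pattern in PyTorch; it is not the paper's custom architecture, and all layer sizes and feature counts are assumptions.

```python
import torch
import torch.nn as nn

class WholeSunSPENet(nn.Module):
    """Hypothetical whole-Sun SPE classifier (illustration only):
    per-AR feature vectors pass through a shared encoder, are pooled
    over all ARs on the disk (permutation-invariant max pooling), and
    are concatenated with global inputs (preceding proton/X-ray
    fluxes, radio-burst counts) before a final classification head.
    """
    def __init__(self, n_ar_features=10, n_global_features=6, hidden=32):
        super().__init__()
        self.ar_encoder = nn.Sequential(
            nn.Linear(n_ar_features, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU())
        self.head = nn.Sequential(
            nn.Linear(hidden + n_global_features, hidden), nn.ReLU(),
            nn.Linear(hidden, 1))  # logit for P(SPE within the next day)

    def forward(self, ar_features, global_features):
        # ar_features: (batch, n_ars, n_ar_features); n_ars may vary per day.
        encoded = self.ar_encoder(ar_features)   # (batch, n_ars, hidden)
        pooled, _ = encoded.max(dim=1)           # pool across ARs
        x = torch.cat([pooled, global_features], dim=-1)
        return torch.sigmoid(self.head(x)).squeeze(-1)

# Example: 4 days, up to 5 ARs each (zero-padded), 6 global features.
model = WholeSunSPENet()
probs = model(torch.randn(4, 5, 10), torch.randn(4, 6))
print(probs)  # daily SPE probabilities
```

Max pooling makes the prediction independent of AR ordering and of the number of regions on the disk, which is what allows a single architecture to take the whole Sun as input.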
The EU-funded FLARECAST project ran from January 2015 until February 2018. FLARECAST had a research-to-operations (R2O) focus and introduced several innovations into the discipline of solar flare forecasting. The FLARECAST innovations were: first, the treatment of hundreds of physical properties viewed as promising flare predictors on equal footing, extending multiple previous works; second, the use of fourteen different machine learning (ML) techniques, also on equal footing, to explore the immense parameter space created by these many predictors; third, the establishment of a robust, three-pronged communication effort oriented toward policy makers, space-weather stakeholders, and the wider public. FLARECAST pledged to make all of its data, codes, and infrastructure openly available worldwide. The combined use of 170+ properties (a total of 209 predictors are now available) in multiple ML algorithms, some of which were designed exclusively for the project, gave rise to changing sets of best-performing predictors for the forecasting of different flaring levels. At the same time, FLARECAST reaffirmed the importance of rigorous training and testing practices to avoid overly optimistic pre-operational prediction performance. In addition, the project (a) tested new and revisited physically intuitive flare predictors and (b) provided meaningful clues toward the transition from flares to eruptive flares, namely events associated with coronal mass ejections (CMEs). These leads, along with the FLARECAST data, algorithms, and infrastructure, could help facilitate integrated space-weather forecasting efforts that take steps to avoid duplication of effort. Despite being one of the most intensive and systematic flare forecasting efforts to date, FLARECAST has not managed to convincingly lift the barrier of stochasticity in solar flare occurrence and forecasting: solar flare prediction thus remains inherently probabilistic.
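One common safeguard behind the rigorous training and testing practices mentioned above is a strictly chronological split, which prevents near-duplicate samples from the same active region appearing in both training and test sets. The sketch below is a minimal illustration under assumed array inputs; it is not FLARECAST's pipeline, and the 209-predictor example data are synthetic.

```python
import numpy as np

def chronological_split(timestamps, features, labels, split_date):
    """Split a predictor data set strictly in time: samples before
    split_date form the training set, the rest form the test set.
    A random shuffle would let near-duplicate samples from the same
    active region land in both sets, inflating apparent skill.
    """
    mask = np.asarray(timestamps) < np.datetime64(split_date)
    return features[mask], labels[mask], features[~mask], labels[~mask]

# Example with hypothetical daily samples over the project lifetime.
times = np.arange('2015-01', '2018-02', dtype='datetime64[D]')
X = np.random.rand(len(times), 209)        # 209 predictors per sample
y = np.random.randint(0, 2, len(times))    # daily flare / no-flare label
X_tr, y_tr, X_te, y_te = chronological_split(times, X, y, '2017-06-01')
```

Evaluating only on the held-out later interval yields the more pessimistic, but more honest, pre-operational performance estimates that the project advocates.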