With the accumulation of big data from coronagraph observations of CMEs, automatic detection and tracking of CMEs have proven crucial. The excellent performance of convolutional neural networks in image classification, object detection, and other computer vision tasks motivates us to apply them to CME detection and tracking as well. We have developed a new tool for CME Automatic detection and tracking with MachinE Learning (CAMEL) techniques. The system is a three-module pipeline. The first module is a supervised image classifier: we train a LeNet neural network with labels obtained from an existing CME catalog, and images containing CME structures are flagged as CME images. Next, to identify the CME region in each CME-flagged image, we use deep descriptor transforming to localize the common object in an image set, and then apply the graph-cut technique to fine-tune the detected CME region. To track the CME through an image sequence, the binary images with detected CME pixels are converted from Cartesian to polar coordinates. A CME event is recorded if it moves through at least two frames and reaches the edge of the coronagraph field of view. For each event, a few fundamental parameters are derived. The results for four representative CMEs with various characteristics are presented and compared with those from four existing automatic and manual catalogs. We find that CAMEL detects more complete and weaker structures, and performs better at catching a CME as early as possible.
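As an illustration of the first (image-classification) module, here is a minimal sketch of a LeNet-style binary classifier in PyTorch. The layer sizes and the 128x128 single-channel input are assumptions for illustration, not the actual parameters used by CAMEL.

```python
import torch
import torch.nn as nn

class LeNetCME(nn.Module):
    """LeNet-style CNN flagging coronagraph images as CME / no-CME.
    Layer sizes and the 128x128 input are illustrative assumptions."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 6, kernel_size=5), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(6, 16, kernel_size=5), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(16 * 29 * 29, 120), nn.ReLU(),
            nn.Linear(120, 84), nn.ReLU(),
            nn.Linear(84, 2),  # logits: [no-CME, CME]
        )

    def forward(self, x):
        return self.classifier(self.features(x))

model = LeNetCME()
batch = torch.randn(4, 1, 128, 128)  # four single-channel coronagraph images
logits = model(batch)
is_cme = logits.argmax(dim=1)        # 1 flags a CME image
```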
As automated vehicles come closer to becoming a reality, it will become mandatory to characterise the performance of their obstacle detection systems. This validation process requires large amounts of ground-truth data, which are currently generated by manual annotation. In this paper, we propose a novel methodology to generate ground-truth kinematics datasets for specific objects in real-world scenes. Our procedure requires no annotation whatsoever, human intervention being limited to sensor calibration. We present the recording platform used to acquire the reference data, together with a detailed and thorough analytical study of the propagation of errors through our procedure. This allows us to provide detailed precision metrics for each and every data item in our datasets. Finally, some visualisations of the acquired data are given.
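The error-propagation analysis mentioned above can be approximated numerically. The sketch below shows generic first-order (Jacobian-based) propagation of sensor uncertainties through a measurement function; the range/bearing example and the noise levels are purely illustrative assumptions, not the sensors or error budget of the paper's platform.

```python
import numpy as np

def propagate_covariance(f, x, cov_x, eps=1e-6):
    """First-order propagation: cov_y ~= J cov_x J^T, with J the Jacobian
    of f at x, estimated here by central finite differences."""
    x = np.asarray(x, dtype=float)
    y0 = np.asarray(f(x))
    J = np.zeros((y0.size, x.size))
    for i in range(x.size):
        dx = np.zeros_like(x)
        dx[i] = eps
        J[:, i] = (np.asarray(f(x + dx)) - np.asarray(f(x - dx))) / (2 * eps)
    return J @ cov_x @ J.T

# Illustrative measurement: target position from range r and bearing theta.
def polar_to_cartesian(p):
    r, theta = p
    return np.array([r * np.cos(theta), r * np.sin(theta)])

cov_sensor = np.diag([0.05**2, np.deg2rad(0.1)**2])  # assumed range/bearing noise
cov_xy = propagate_covariance(polar_to_cartesian, [20.0, 0.3], cov_sensor)
print(np.sqrt(np.diag(cov_xy)))  # per-axis standard deviations in metres
```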
In order to understand stellar evolution, it is crucial to efficiently determine stellar surface rotation periods. An efficient tool to automatically determine reliable rotation periods is needed when dealing with large samples of stellar photometric datasets. The objective of this work is to develop such a tool. Random forest learning abilities are exploited to automate the extraction of rotation periods from Kepler light curves. Rotation periods and complementary parameters are obtained from three different methods: a wavelet analysis, the autocorrelation function of the light curve, and the composite spectrum. We train three different classifiers: one to detect whether rotational modulations are present in the light curve, one to flag close-binary or classical-pulsator candidates that can bias our rotation period determination, and finally one to provide the final rotation period. We test our machine learning pipeline on 23,431 stars of the Kepler K and M dwarf reference rotation catalog of Santos et al. (2019), for which 60% of the stars have been visually inspected. For the sample of 21,707 stars where all the input parameters are provided to the algorithm, 94.2% are correctly classified (as rotating or not). Among the stars that have a rotation period in the reference catalog, the machine learning pipeline provides a period that agrees to within 10% of the reference value for 95.3% of the stars. Moreover, the yield of correct rotation periods rises to 99.5% after visual inspection of 25.2% of the stars. Over the two main analysis steps, rotation classification and period selection, the pipeline yields a global agreement with the reference values of 92.1% and 96.9% before and after visual inspection, respectively. Random forest classifiers are efficient tools to determine reliable rotation periods in large samples of stars. [abridged]
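A minimal sketch of the first classification step (rotating vs. not rotating) using scikit-learn's RandomForestClassifier. The synthetic feature matrix is an assumption standing in for the wavelet, autocorrelation-function, and composite-spectrum parameters the pipeline actually feeds the classifier.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Placeholder features standing in for the wavelet, ACF, and composite-spectrum
# parameters (e.g. candidate periods, peak heights, an activity proxy).
n_stars = 1000
X = rng.normal(size=(n_stars, 6))
y = (X[:, 0] + 0.5 * X[:, 3] + rng.normal(scale=0.5, size=n_stars) > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
clf = RandomForestClassifier(n_estimators=300, random_state=0)
clf.fit(X_train, y_train)
print(f"rotating / not-rotating accuracy: {clf.score(X_test, y_test):.3f}")
```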
Scenario optimization is by now a well-established technique for performing designs in the presence of uncertainty. It relies on domain knowledge integrated with first-hand information that comes from data, and generates solutions that are accompanied by precise statements of reliability. In this paper, following recent developments in (Garatti and Campi, 2019), we venture beyond the traditional set-up of scenario optimization by analyzing the concept of constraint relaxation. Building on a solid theoretical underpinning, this new paradigm furnishes fundamental tools to perform designs that strike a proper compromise between robustness and performance. After suitably expanding the scope of constraint relaxation as proposed in (Garatti and Campi, 2019), we focus on various classical Support Vector methods in machine learning - including SVM (Support Vector Machine), SVR (Support Vector Regression) and SVDD (Support Vector Data Description) - and derive new results on the ability of these methods to generalize.
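To make the link between scenario constraints and Support Vector methods concrete, the sketch below writes soft-margin SVM as a scenario program in which each data point contributes one constraint and the slack variables implement the constraint relaxation. It uses cvxpy with synthetic data; this illustrates only the set-up, not the generalization results derived in the paper.

```python
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(1)
# Synthetic two-class scenarios (data points): each one induces a constraint.
X = np.vstack([rng.normal(-1, 1, (50, 2)), rng.normal(1, 1, (50, 2))])
y = np.hstack([-np.ones(50), np.ones(50)])

w = cp.Variable(2)
b = cp.Variable()
xi = cp.Variable(100, nonneg=True)  # relaxation variables, one per scenario
C = 1.0                             # robustness vs. performance trade-off

# Relaxed scenario constraints: violating y_i (w.x_i + b) >= 1 costs xi_i.
constraints = [cp.multiply(y, X @ w + b) >= 1 - xi]
objective = cp.Minimize(0.5 * cp.sum_squares(w) + C * cp.sum(xi))
cp.Problem(objective, constraints).solve()

n_relaxed = int(np.sum(xi.value > 1e-6))
print(f"scenarios with relaxed constraints: {n_relaxed}")
```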
Aims. The derivation of spectroscopic parameters for M dwarf stars is very important in the fields of stellar and exoplanet characterization. The goal of this work is the creation of an automatic computational tool able to derive, quickly and reliably, the $T_{\mathrm{eff}}$ and [Fe/H] of M dwarfs from their optical spectra, which can be obtained with different spectrographs at different resolutions. Methods. ODUSSEAS (Observing Dwarfs Using Stellar Spectroscopic Energy-Absorption Shapes) is based on the measurement of the pseudo-equivalent widths of more than 4000 stellar absorption lines and on the use of the machine learning Python package scikit-learn for predicting the stellar parameters. Results. We show that our tool is able to derive parameters accurately and with high precision, with precision errors of ~30 K for $T_{\mathrm{eff}}$ and ~0.04 dex for [Fe/H]. The results are consistent for spectra with resolutions between 48,000 and 115,000 and S/N above 20.
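A minimal sketch of the kind of scikit-learn regression ODUSSEAS builds on: mapping a vector of pseudo-equivalent widths to $T_{\mathrm{eff}}$ and [Fe/H]. The synthetic pEW matrix and the choice of ridge regression are assumptions for illustration; the tool's actual reference set and model may differ.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(2)

# Placeholder design matrix: one row per reference star, one column per
# pseudo-equivalent width (the tool measures >4000 absorption lines).
n_stars, n_lines = 60, 4000
pew = rng.normal(size=(n_stars, n_lines))
teff = 3300 + 400 * pew[:, 0] + rng.normal(scale=30, size=n_stars)  # K
feh = 0.1 * pew[:, 1] + rng.normal(scale=0.04, size=n_stars)        # dex

targets = np.column_stack([teff, feh])  # Ridge handles multi-output natively
preds = cross_val_predict(Ridge(alpha=1.0), pew, targets, cv=5)
print(f"Teff scatter: {np.std(preds[:, 0] - teff):.0f} K, "
      f"[Fe/H] scatter: {np.std(preds[:, 1] - feh):.3f} dex")
```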
Automated machine learning (AutoML) aims to find optimal machine learning solutions automatically for a given machine learning problem. It can relieve data scientists of the burden of the multifarious manual tuning process and give domain experts access to off-the-shelf machine learning solutions without requiring extensive experience. In this paper, we review current developments in AutoML in terms of three categories: automated feature engineering (AutoFE), automated model and hyperparameter learning (AutoMHL), and automated deep learning (AutoDL). State-of-the-art techniques adopted in the three categories are presented, including Bayesian optimization, reinforcement learning, evolutionary algorithms, and gradient-based approaches. We summarize popular AutoML frameworks and conclude with the current open challenges of AutoML.
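As a concrete, minimal instance of the AutoMHL category, the sketch below runs a random hyperparameter search over a small model space with scikit-learn. The search space, model, and budget are arbitrary assumptions; the Bayesian, reinforcement-learning, evolutionary, and gradient-based methods reviewed here are more sample-efficient alternatives to this baseline.

```python
from scipy.stats import randint, uniform
from sklearn.datasets import load_digits
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import RandomizedSearchCV

X, y = load_digits(return_X_y=True)

# Hyperparameter space: the model-and-hyperparameter part of the AutoML problem.
space = {
    "n_estimators": randint(50, 300),
    "max_depth": randint(2, 6),
    "learning_rate": uniform(0.01, 0.3),
}

search = RandomizedSearchCV(
    GradientBoostingClassifier(random_state=0),
    param_distributions=space,
    n_iter=20,   # evaluation budget; Bayesian optimization spends it adaptively
    cv=3,
    random_state=0,
)
search.fit(X, y)
print(search.best_params_, f"cv accuracy: {search.best_score_:.3f}")
```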