Due to the ever-expanding volume of observed spectroscopic data from surveys such as SDSS and LAMOST, it has become important to apply artificial intelligence (AI) techniques to the analysis of stellar spectra, both for spectral classification and for regression problems such as the determination of the stellar atmospheric parameters Teff, log g, and [Fe/H]. We propose an automated approach for the classification of stellar spectra in the optical region using Convolutional Neural Networks. Traditional machine learning (ML) methods with shallow architectures (usually up to 2 hidden layers) have been trained for these purposes in the past. However, deep learning methods with a larger number of hidden layers allow the use of finer details in the spectrum, which results in improved accuracy and better generalisation. Studying finer spectral signatures also enables us to determine accurate differential stellar parameters and find rare objects. We examine various machine and deep learning algorithms, namely Artificial Neural Networks (ANN), Random Forest (RF), and Convolutional Neural Networks (CNN), to classify stellar spectra using the Jacoby Atlas, ELODIE and MILES spectral libraries as training samples. We test the performance of the trained networks on the Indo-U.S. Library of Coude Feed Stellar Spectra (CFLIB). We show that, using convolutional neural networks, we are able to lower the error to 1.23 spectral sub-classes, compared with the 2 sub-classes achieved in past studies with ML approaches. We further apply the trained model to classify stellar spectra retrieved from the SDSS database with SNR > 20.
Discrete Fourier transforms provide a significant speedup in the computation of convolutions in deep learning. In this work, we demonstrate that, beyond its advantages for efficient computation, the spectral domain also provides a powerful representation in which to model and train convolutional neural networks (CNNs). We employ spectral representations to introduce a number of innovations to CNN design. First, we propose spectral pooling, which performs dimensionality reduction by truncating the representation in the frequency domain. This approach preserves considerably more information per parameter than other pooling strategies and enables flexibility in the choice of pooling output dimensionality. This representation also enables a new form of stochastic regularization by randomized modification of resolution. We show that these methods achieve competitive results on classification and approximation tasks, without using any dropout or max-pooling. Finally, we demonstrate the effectiveness of complex-coefficient spectral parameterization of convolutional filters. While this leaves the underlying model unchanged, it results in a representation that greatly facilitates optimization. We observe on a variety of popular CNN configurations that this leads to significantly faster convergence during training.
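The idea of pooling by truncating the frequency-domain representation can be illustrated with a minimal one-dimensional sketch. This is not the authors' implementation: the function names `dft`, `idft`, and `spectral_pool_1d` are hypothetical, the transform is a naive O(n^2) DFT written for readability, and the original work operates on two-dimensional feature maps inside a CNN.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform of a sequence (O(n^2), for illustration)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(X):
    """Naive inverse discrete Fourier transform."""
    n = len(X)
    return [
        sum(X[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def spectral_pool_1d(x, out_len):
    """Reduce a real signal to out_len samples by keeping only its lowest frequencies.

    out_len can be any integer in 1..len(x), which is the flexibility in
    output dimensionality that frequency truncation allows.
    """
    n = len(x)
    if not 0 < out_len <= n:
        raise ValueError("out_len must satisfy 0 < out_len <= len(x)")
    X = dft(x)
    n_pos = (out_len + 1) // 2  # DC plus the lowest positive frequencies
    n_neg = out_len // 2        # the matching lowest negative frequencies
    kept = X[:n_pos] + X[n - n_neg:]
    # Rescale so that the mean of the signal is preserved after truncation.
    scale = out_len / n
    y = idft([c * scale for c in kept])
    # For even out_len the highest kept bin may be complex; taking the real
    # part of the output is equivalent to keeping only that bin's real part.
    return [v.real for v in y]
```

For example, `spectral_pool_1d([math.cos(2 * math.pi * t / 8) for t in range(8)], 5)` returns one period of the same cosine resampled at 5 points, since the signal's only frequency component survives the truncation.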
Vetting of exoplanet candidates in transit surveys is a manual process, which suffers from a large number of false positives and a lack of consistency. Previous work has shown that Convolutional Neural Networks (CNN) provide an efficient solution to these problems. Here, we apply a CNN to classify planet candidates from the Next Generation Transit Survey (NGTS). For training, we compare datasets of real data with injected planetary transits against fully simulated data, and examine how their different compositions affect network performance. We show that fewer hand-labelled lightcurves can be utilised, while still achieving competitive results. With our best model, we achieve an AUC (area under the curve) score of $(95.6\pm{0.2})\%$ and an accuracy of $(88.5\pm{0.3})\%$ on our unseen test data, as well as $(76.5\pm{0.4})\%$ and $(74.6\pm{1.1})\%$ in comparison to our existing manual classifications. The neural network recovers 13 out of 14 confirmed planets observed by NGTS, with high probability. We use simulated data to show that the overall network performance is resilient to mislabelling of the training dataset, a problem that might arise due to unidentified, low signal-to-noise transits. Using a CNN, the time required for vetting can be reduced by half, while still recovering the vast majority of manually flagged candidates. In addition, we identify many new candidates with high probabilities which were not flagged by human vetters.
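As a reminder of what the AUC score measures, the sketch below computes the area under the ROC curve through its rank interpretation: the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. The function name `roc_auc` and the toy labels and scores are illustrative assumptions, not data or code from the NGTS study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    labels: iterable of 0 (negative) or 1 (positive)
    scores: classifier outputs, higher meaning "more likely positive"
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0   # positive correctly ranked above negative
            elif p == q:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical values): 3 of the 4 positive/negative pairs
# are ranked correctly.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

The pairwise loop is quadratic in the number of examples; it is written this way for clarity, and a sort-based computation would be preferable for large test sets.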
We propose two deep neural network architectures for classification of arbitrary-length electrocardiogram (ECG) recordings and evaluate them on the atrial fibrillation (AF) classification data set provided by the PhysioNet/CinC Challenge 2017. The first architecture is a deep convolutional neural network (CNN) with averaging-based feature aggregation across time. The second architecture combines convolutional layers for feature extraction with long short-term memory (LSTM) layers for temporal aggregation of features. As a key ingredient of our training procedure, we introduce a simple data augmentation scheme for ECG data and demonstrate its effectiveness in the AF classification task at hand. The second architecture was found to outperform the first one, obtaining an $F_1$ score of $82.1\%$ on the hidden challenge testing set.
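Averaging-based aggregation across time is what lets a convolutional network accept recordings of arbitrary length: however many time steps the convolutional layers emit, averaging over the time axis yields a vector of fixed size. A minimal sketch follows, assuming the feature map is given as a list of time steps with one value per channel; the name `average_over_time` and this data layout are hypothetical and do not reproduce the authors' architecture.

```python
def average_over_time(feature_map):
    """Collapse a (time, channels) feature map to one vector of length `channels`.

    feature_map: non-empty list of T time steps, each a list of C channel values.
    The output length depends only on C, never on T.
    """
    if not feature_map:
        raise ValueError("feature_map must contain at least one time step")
    n_steps = len(feature_map)
    n_channels = len(feature_map[0])
    return [
        sum(step[c] for step in feature_map) / n_steps
        for c in range(n_channels)
    ]


# Two recordings of different length produce vectors of the same size.
short = [[1.0, 2.0], [3.0, 4.0]]
long_ = [[0.0, 1.0]] * 5
print(average_over_time(short))  # [2.0, 3.0]
print(average_over_time(long_))  # [0.0, 1.0]
```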
We introduce a convolutional recurrent neural network (CRNN) for music tagging. CRNNs take advantage of convolutional neural networks (CNNs) for local feature extraction and recurrent neural networks for temporal summarisation of the extracted features. We compare the CRNN with three CNN structures that have been used for music tagging, evaluating their performance and training time per sample while controlling the number of parameters. Overall, we found that CRNNs show strong performance with respect to the number of parameters and training time, indicating the effectiveness of their hybrid structure in music feature extraction and feature summarisation.
The NASA Transiting Exoplanet Survey Satellite (TESS) is observing tens of millions of stars with time spans ranging from $\sim$ 27 days to about 1 year of continuous observations. This vast amount of data contains a wealth of information for variability, exoplanet, and stellar astrophysics studies but requires a number of processing steps before it can be fully utilized. In order to efficiently process all the TESS data and make it available to the wider scientific community, the TESS Data for Asteroseismology working group, as part of the TESS Asteroseismic Science Consortium, has created an automated open-source processing pipeline to produce light curves corrected for systematics from the short- and long-cadence raw photometry data and to classify these according to stellar variability type. We will process all stars down to a TESS magnitude of 15. This paper is the next in a series detailing how the pipeline works. Here, we present our methodology for the automatic variability classification of TESS photometry using an ensemble of supervised learners that are combined into a metaclassifier. We successfully validate our method using a carefully constructed labelled sample of Kepler Q9 light curves with a 27.4-day time span mimicking single-sector TESS observations, on which we obtain an overall accuracy of 94.9%. We demonstrate that our methodology can successfully classify stars outside of our labelled sample by applying it to all $\sim$ 167,000 stars observed in Q9 of the Kepler space mission.