We apply classical machine vision and deep learning methods to prototype signal classifiers for the search for extraterrestrial intelligence. Our novel approach uses two-dimensional spectrograms of measured and simulated radio signals bearing the imprint of a technological origin. The studies are performed using archived narrow-band signal data captured from real-time SETI observations with the Allen Telescope Array and a set of digitally simulated signals designed to mimic real observed signals. By treating the 2D spectrogram as an image, we show that parametric and non-parametric classifiers based on automated visual analysis can achieve high discrimination and accuracy, as well as low false-positive rates. The (real) archived data were subjected to numerous feature-extraction algorithms based on the vertical and horizontal image moments and Hough transforms to accommodate feature rotation. The most successful algorithm used a two-step process in which the image was first filtered with a rotation-, scale- and shift-invariant affine transform and then correlated with a previously defined set of labeled prototype examples. The real data often contained multiple signals and signal ghosts, so we performed our non-parametric evaluation using a simpler and more controlled dataset produced by simulating complex-valued voltage data with properties similar to the observed prototypes. The most successful non-parametric classifier employed a wide residual (convolutional) neural network based on pre-existing classifiers in current use for object detection in ordinary photographs. These results are relevant to a wide variety of research domains that already employ spectrogram analysis, from time-domain astronomy to observations of earthquakes to animal vocalization analysis.
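The two-step classification scheme described above (an invariance-inducing transform followed by correlation against labeled prototypes) can be illustrated with a minimal sketch. This is not the authors' pipeline: translation- and scale-normalized central image moments stand in here for the affine-invariant filter, the correlation step is a plain Pearson correlation of moment vectors, and the names `central_moments` and `nearest_prototype` are illustrative.

```python
import numpy as np

def central_moments(img, max_order=3):
    """Translation- and scale-normalized central moments of a 2D spectrogram."""
    y, x = np.mgrid[:img.shape[0], :img.shape[1]]
    m00 = img.sum()
    xbar = (x * img).sum() / m00
    ybar = (y * img).sum() / m00
    feats = []
    for p in range(max_order + 1):
        for q in range(max_order + 1):
            if p + q < 2:
                continue  # mu00, mu01, mu10 carry no shape information
            mu = (((x - xbar) ** p) * ((y - ybar) ** q) * img).sum()
            # divide by m00^(1+(p+q)/2) for intensity/scale normalization
            feats.append(mu / m00 ** (1 + (p + q) / 2))
    return np.array(feats)

def nearest_prototype(img, prototypes):
    """Classify by correlating moment features with labeled prototype spectrograms."""
    f = central_moments(img)
    scores = {label: np.corrcoef(f, central_moments(p))[0, 1]
              for label, p in prototypes.items()}
    return max(scores, key=scores.get)
```

Because the moments are normalized by total power, a rescaled copy of a prototype (e.g. a brighter drifting tone) correlates perfectly with it and is assigned the same label.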
We investigate star-galaxy classification for astronomical surveys in the context of four methods enabling the interpretation of black-box machine learning systems. First, we output and explore the decision boundaries given by decision tree based methods, which enables visualization of the classification categories. Second, we investigate how the Mutual Information based Transductive Feature Selection (MINT) algorithm can be used to perform feature pre-selection: if one would like to provide only a small number of input features to a machine learning classification algorithm, feature pre-selection determines which of the many possible input properties should be selected. Third, we use the tree-interpreter package to enable popular decision tree based ensemble methods to be opened, visualized, and understood. This is done by additional analysis of the tree based model, determining not only which features are important to the model, but how important a feature is for a particular classification given its value. Lastly, we use decision boundaries from the model to revise an already existing classification method, essentially asking the tree based method where decision boundaries are best placed and defining a new classification method. We showcase these techniques by applying them to the problem of star-galaxy separation using data from the Sloan Digital Sky Survey (hereafter SDSS). We use the output of MINT and the ensemble methods to demonstrate how more complex decision boundaries improve star-galaxy classification accuracy over the standard SDSS frames approach (reducing misclassifications by up to $\approx 33\%$). We then show how tree-interpreter can be used to explore how relevant each photometric feature is when making a classification on an object by object basis.
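The idea behind mutual-information feature pre-selection can be sketched in a few lines: rank candidate features by their estimated mutual information with the class labels and keep the top few. This is a hedged illustration using a plain histogram MI estimate, not the transductive MINT algorithm itself; the names `mutual_information` and `preselect` are assumptions.

```python
import numpy as np

def mutual_information(x, y, bins=8):
    """Histogram estimate of MI between a continuous feature x and discrete labels y."""
    edges = np.histogram(x, bins=bins)[1]
    x_binned = np.digitize(x, edges[1:-1])          # bin indices 0..bins-1
    labels = {lab: i for i, lab in enumerate(np.unique(y))}
    joint = np.zeros((bins, len(labels)))
    for xi, yi in zip(x_binned, y):
        joint[xi, labels[yi]] += 1
    joint /= joint.sum()
    px = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)
    nz = joint > 0
    return float((joint[nz] * np.log(joint[nz] / (px @ py)[nz])).sum())

def preselect(X, y, k=2):
    """Greedy pre-selection: keep the k features with highest MI against the labels.

    Note: unlike MINT, this ignores redundancy between the selected features.
    """
    mi = [mutual_information(X[:, j], y) for j in range(X.shape[1])]
    return np.argsort(mi)[::-1][:k]
```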
In the spirit of Trimble's ``Astrophysics in XXXX'' series, I very briefly and subjectively review developments in SETI in 2020. My primary focus is 74 papers and books published or made public in 2020, which I sort into six broad categories: results from actual searches, new search methods and instrumentation, target and frequency selection, the development of technosignatures, theory of ETIs, and social aspects of SETI.
Automated photometric supernova classification has become an active area of research in recent years in light of current and upcoming imaging surveys such as the Dark Energy Survey (DES) and the Large Synoptic Survey Telescope, given that spectroscopic confirmation of type for all supernovae discovered will be impossible. Here, we develop a multi-faceted classification pipeline, combining existing and new approaches. Our pipeline consists of two stages: extracting descriptive features from the light curves and classification using a machine learning algorithm. Our feature extraction methods vary from model-dependent techniques, namely SALT2 fits, to more independent techniques fitting parametric models to curves, to a completely model-independent wavelet approach. We cover a range of representative machine learning algorithms, including naive Bayes, k-nearest neighbors, support vector machines, artificial neural networks and boosted decision trees (BDTs). We test the pipeline on simulated multi-band DES light curves from the Supernova Photometric Classification Challenge. Using the commonly used area under the curve (AUC) of the Receiver Operating Characteristic as a metric, we find that the SALT2 fits and the wavelet approach, with the BDTs algorithm, each achieves an AUC of 0.98, where 1 represents perfect classification. We find that a representative training set is essential for good classification, whatever the feature set or algorithm, with implications for spectroscopic follow-up. Importantly, we find that by using either the SALT2 or the wavelet feature sets with a BDT algorithm, accurate classification is possible purely from light curve data, without the need for any redshift information.
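The AUC metric used to score the pipeline has a compact closed form: it equals the Mann-Whitney U statistic normalized by the number of positive-negative pairs, i.e. the probability that a randomly chosen supernova of the target type is scored above a randomly chosen one of another type. A minimal sketch (tie handling omitted; `roc_auc` is an illustrative name, not part of the paper's code):

```python
import numpy as np

def roc_auc(labels, scores):
    """Area under the ROC curve via the rank-sum identity (assumes no tied scores)."""
    order = np.argsort(scores)
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)   # 1-based ranks by ascending score
    pos = labels == 1
    n_pos, n_neg = pos.sum(), (~pos).sum()
    # Mann-Whitney U for the positive class, normalized to [0, 1]
    return float((ranks[pos].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg))
```

An AUC of 0.98, as reported for the SALT2 and wavelet feature sets, means a randomly drawn positive outranks a randomly drawn negative 98 per cent of the time.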
We demonstrate the application of a convolutional neural network to the gravitational wave signals from core collapse supernovae. Using simulated time series of gravitational wave detectors, we show that, based on the explosion mechanisms, a convolutional neural network can be used to detect and classify the gravitational wave signals buried in noise. For the waveforms used in the training of the convolutional neural network, our results suggest that a network of advanced LIGO, advanced VIRGO and KAGRA, or a network of LIGO A+, advanced VIRGO and KAGRA, is likely to detect a magnetorotational core collapse supernova within the Large and Small Magellanic Clouds, or a Galactic event if the explosion mechanism is the neutrino-driven mechanism. By testing the convolutional neural network with waveforms not used for training, we show that the true alarm probabilities are 52% and 83% at 60 kpc for waveforms R3E1AC and R4E1FC L. For waveforms s20 and SFHx at 10 kpc, the true alarm probabilities are 70% and 93% respectively, all at a false alarm probability of 10%.
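At the core of such a network is the convolution of the input time series with a bank of learned kernels. A minimal numpy sketch of that single operation (not the paper's network or training setup; `conv1d` and the matched-burst example are illustrative):

```python
import numpy as np

def conv1d(signal, kernels, stride=1):
    """Valid-mode 1D convolution of a time series with a bank of kernels.

    signal:  1D array (detector strain time series).
    kernels: array of shape (n_kernels, kernel_length).
    Returns feature maps of shape (n_windows, n_kernels).
    """
    k = kernels.shape[1]
    n_out = (len(signal) - k) // stride + 1
    windows = np.stack([signal[i * stride : i * stride + k] for i in range(n_out)])
    return windows @ kernels.T
```

When a kernel happens to match a burst buried in the series, the feature map peaks at the burst location; a trained CNN learns such kernels (plus nonlinearities and pooling) from labeled waveforms rather than being handed them.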
We show that multiple machine learning algorithms can match human performance in classifying transient imaging data from the Sloan Digital Sky Survey (SDSS) supernova survey into real objects and artefacts. This is a first step in any transient science pipeline and is currently still done by humans, but future surveys such as the Large Synoptic Survey Telescope (LSST) will necessitate fully machine-enabled solutions. Using features trained from eigenimage analysis (principal component analysis, PCA) of single-epoch g, r and i-difference images, we can reach a completeness (recall) of 96 per cent, while only incorrectly classifying at most 18 per cent of artefacts as real objects, corresponding to a precision (purity) of 84 per cent. In general, random forests performed best, followed by the k-nearest neighbour and the SkyNet artificial neural net algorithms, compared to other methods such as naive Bayes and kernel support vector machine. Our results show that PCA-based machine learning can match human success levels and can naturally be extended by including multiple epochs of data, transient colours and host galaxy information which should allow for significant further improvements, especially at low signal-to-noise.
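Eigenimage features of the kind described can be sketched with an SVD of the stack of flattened difference images: the leading right singular vectors are the eigenimages, and each image's projections onto them form its feature vector. This is an illustrative implementation, not the authors' code; `eigenimage_features` and its signature are assumptions.

```python
import numpy as np

def eigenimage_features(images, n_components=5):
    """PCA features of a stack of difference images.

    images: array of shape (n_images, height, width).
    Returns (features, components): features has shape (n_images, n_components),
    components holds the flattened eigenimages, shape (n_components, height*width).
    """
    X = images.reshape(len(images), -1).astype(float)
    X -= X.mean(axis=0)                      # centre each pixel across the sample
    U, S, Vt = np.linalg.svd(X, full_matrices=False)
    components = Vt[:n_components]           # leading eigenimages, orthonormal rows
    return X @ components.T, components
```

The resulting low-dimensional features are what a random forest or k-nearest-neighbour classifier would then consume for real/artefact separation.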