Anomaly detection in Astrophysics: a comparison between unsupervised Deep and Machine Learning on KiDS data

84 0 0.0 ( 0 )

Download Cite

Added by Stefano Cavuoti

Publication date 2020

fields Physics

and research's language is English

Authors Maurizio DAddona - Giuseppe Riccio - Stefano Cavuoti

Instrumentation and Methods for Astrophysics

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

Every field of Science is undergoing unprecedented changes in the discovery process, and Astronomy has been a main player in this transition since the beginning. The ongoing and future large and complex multi-messenger sky surveys impose a wide exploiting of robust and efficient automated methods to classify the observed structures and to detect and characterize peculiar and unexpected sources. We performed a preliminary experiment on KiDS DR4 data, by applying to the problem of anomaly detection two different unsupervised machine learning algorithms, considered as potentially promising methods to detect peculiar sources, a Disentangled Convolutional Autoencoder and an Unsupervised Random Forest. The former method, working directly on images, is considered potentially able to identify peculiar objects like interacting galaxies and gravitational lenses. The latter instead, working on catalogue data, could identify objects with unusual values of magnitudes and colours, which in turn could indicate the presence of singularities.

rate research

Systematic Serendipity: A Test of Unsupervised Machine Learning as a Method for Anomaly Detection

107 - Daniel Giles , Lucianne Walkowicz 2018

Advances in astronomy are often driven by serendipitous discoveries. As survey astronomy continues to grow, the size and complexity of astronomical databases will increase, and the ability of astronomers to manually scour data and make such discoveries decreases. In this work, we introduce a machine learning-based method to identify anomalies in large datasets to facilitate such discoveries, and apply this method to long cadence lightcurves from NASAs Kepler Mission. Our method clusters data based on density, identifying anomalies as data that lie outside of dense regions. This work serves as a proof-of-concept case study and we test our method on four quarters of the Kepler long cadence lightcurves. We use Keplers most notorious anomaly, Boyajians Star (KIC 8462852), as a rare `ground truth for testing outlier identification to verify that objects of genuine scientific interest are included among the identified anomalies. We evaluate the methods ability to identify known anomalies by identifying unusual behavior in Boyajians Star, we report the full list of identified anomalies for these quarters, and present a sample subset of identified outliers that includes unusual phenomena, objects that are rare in the Kepler field, and data artifacts. By identifying <4% of each quarter as outlying data, we demonstrate that this anomaly detection method can create a more targeted approach in searching for rare and novel phenomena.

Instrumentation and Methods for Astrophysics Solar and Stellar Astrophysics

Unsupervised Distribution Learning for Lunar Surface Anomaly Detection

360 - Adam Lesnikowski , Valentin T. Bickel , Daniel Angerhausen 2020

In this work we show that modern data-driven machine learning techniques can be successfully applied on lunar surface remote sensing data to learn, in an unsupervised way, sufficiently good representations of the data distribution to enable lunar technosignature and anomaly detection. In particular we train an unsupervised distribution learning neural network model to find the Apollo 15 landing module in a testing dataset, with no dataset specific model or hyperparameter tuning. Sufficiently good unsupervised data density estimation has the promise of enabling myriad useful downstream tasks, including locating lunar resources for future space flight and colonization, finding new impact craters or lunar surface reshaping, and algorithmically deciding the importance of unlabeled samples to send back from power- and bandwidth-constrained missions. We show in this work that such unsupervised learning can be successfully done in the lunar remote sensing and space science contexts.

Instrumentation and Methods for Astrophysics Earth and Planetary Astrophysics Machine Learning

Unsupervised Anomaly Detection on Temporal Multiway Data

222 - Duc Nguyen , Phuoc Nguyen , Kien Do 2020

Temporal anomaly detection looks for irregularities over space-time. Unsupervised temporal models employed thus far typically work on sequences of feature vectors, and much less on temporal multiway data. We focus our investigation on two-way data, in which a data matrix is observed at each time step. Leveraging recent advances in matrix-native recurrent neural networks, we investigated strategies for data arrangement and unsupervised training for temporal multiway anomaly detection. These include compressing-decompressing, encoding-predicting, and temporal data differencing. We conducted a comprehensive suite of experiments to evaluate model behaviors under various settings on synthetic data, moving digits, and ECG recordings. We found interesting phenomena not previously reported. These include the capacity of the compact matrix LSTM to compress noisy data near perfectly, making the strategy of compressing-decompressing data ill-suited for anomaly detection under the noise. Also, long sequence of vectors can be addressed directly by matrix models that allow very long context and multiple step prediction. Overall, the encoding-predicting strategy works very well for the matrix LSTMs in the conducted experiments, thanks to its compactness and better fit to the data dynamics.

Machine Learning Artificial Intelligence

Machine Learning on Difference Image Analysis: A comparison of methods for transient detection

86 - B. Sanchez , M. J. Dominguez R. , M. Lares 2018

We present a comparison of several Difference Image Analysis (DIA) techniques, in combination with Machine Learning (ML) algorithms, applied to the identification of optical transients associated with gravitational wave events. Each technique is assessed based on the scoring metrics of Precision, Recall, and their harmonic mean F1, measured on the DIA results as standalone techniques, and also in the results after the application of ML algorithms, on transient source injections over simulated and real data. This simulations cover a wide range of instrumental configurations, as well as a variety of scenarios of observation conditions, by exploring a multi dimensional set of relevant parameters, allowing us to extract general conclusions related to the identification of transient astrophysical events. The newest subtraction techniques, and particularly the methodology published in Zackay et al. (2016) are implemented in an Open Source Python package, named properimage, suitable for many other astronomical image analyses. This together with the ML libraries we describe, provides an effective transient detection software pipeline. Here we study the effects of the different ML techniques, and the relative feature importances for classification of transient candidates, and propose an optimal combined strategy. This constitutes the basic elements of pipelines that could be applied in searches of electromagnetic counterparts to GW sources.

Instrumentation and Methods for Astrophysics

LRP2020: Machine Learning Advantages in Canadian Astrophysics

61 - K.A. Venn , S. Fabbro , A Liu 2019

The application of machine learning (ML) methods to the analysis of astrophysical datasets is on the rise, particularly as the computing power and complex algorithms become more powerful and accessible. As the field of ML enjoys a continuous stream of breakthroughs, its applications demonstrate the great potential of ML, ranging from achieving tens of millions of times increase in analysis speed (e.g., modeling of gravitational lenses or analysing spectroscopic surveys) to solutions of previously unsolved problems (e.g., foreground subtraction or efficient telescope operations). The number of astronomical publications that include ML has been steadily increasing since 2010. With the advent of extremely large datasets from a new generation of surveys in the 2020s, ML methods will become an indispensable tool in astrophysics. Canada is an unambiguous world leader in the development of the field of machine learning, attracting large investments and skilled researchers to its prestigious AI Research Institutions. This provides a unique opportunity for Canada to also be a world leader in the application of machine learning in the field of astrophysics, and foster the training of a new generation of highly skilled researchers.

Instrumentation and Methods for Astrophysics