In the new era of very large telescopes, where data is crucial for expanding scientific knowledge, we have witnessed many deep learning applications for the automatic classification of lightcurves. Recurrent neural networks (RNNs) are one of the models used for these applications, and the LSTM unit stands out as an excellent choice for representing long time series. In general, RNNs assume observations at regularly spaced discrete times, which may not suit the irregular sampling of lightcurves. A traditional technique for handling irregular sequences is to add the sampling times to the network's input, but this does not guarantee that sampling irregularities are captured during training. Alternatively, the Phased LSTM (PLSTM) unit was introduced to address this problem by updating its state using the sampling times explicitly. In this work, we study the effectiveness of LSTM- and Phased LSTM-based architectures for the classification of astronomical lightcurves. We use seven catalogs containing periodic and nonperiodic astronomical objects. Our findings show that the LSTM outperformed the PLSTM on 6 of the 7 datasets; however, the combination of both units improves the results on all datasets.
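As a concrete illustration of the mechanism under comparison: the Phased LSTM of Neil et al. (2016) gates the ordinary LSTM state update with a periodic time gate evaluated at the actual observation times, so the state changes appreciably only when the gate is open. The NumPy sketch below implements the published gate formulation; it illustrates the general technique rather than the specific architectures trained in this work, and the parameter values are illustrative.

```python
import numpy as np

def time_gate(t, tau, s, r_on, alpha=1e-3):
    """Openness k(t) of the Phased LSTM time gate (Neil et al. 2016).

    t     : observation times (e.g. MJDs of a lightcurve)
    tau   : oscillation period of the gate (learned per unit)
    s     : phase shift of the gate (learned per unit)
    r_on  : fraction of the period during which the gate is open
    alpha : small leak so gradients flow while the gate is closed
    """
    phi = np.mod(t - s, tau) / tau  # phase of t within the gate cycle, in [0, 1)
    k = np.where(
        phi < 0.5 * r_on, 2.0 * phi / r_on,           # opening half of the gate
        np.where(phi < r_on, 2.0 - 2.0 * phi / r_on,  # closing half of the gate
                 alpha * phi),                        # closed, with a small leak
    )
    return k

# The gate masks the usual LSTM update, so for the cell state c:
#   c_t = k(t) * c_candidate + (1 - k(t)) * c_prev
k = time_gate(np.array([0.3, 1.7, 4.2, 9.8]), tau=5.0, s=0.0, r_on=0.05)
```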
We apply the technique of self-organising maps (Kohonen 1990) to the automated classification of singly periodic astronomical lightcurves. We find that our maps readily distinguish between lightcurve types in both synthetic and real datasets, and that the resulting maps do not depend sensitively on the chosen learning parameters. Automated data analysis techniques are likely to become increasingly important as the size of astronomical datasets continues to increase, particularly with the advent of ultra-wide-field survey telescopes such as WASP, RAPTOR and ASAS.
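For readers unfamiliar with the method: a self-organising map trains a grid of weight vectors so that similar inputs activate nearby grid nodes, after which lightcurve types cluster into distinct regions of the map. Below is a minimal NumPy sketch of Kohonen's training loop under common default choices (Gaussian neighbourhood, exponentially decaying learning rate and neighbourhood width); the paper's exact schedules and input representation are assumptions here.

```python
import numpy as np

def train_som(data, grid=(10, 10), epochs=50, lr0=0.5, sigma0=3.0, seed=0):
    """Minimal Kohonen self-organising map.

    data : (n_samples, n_features) array, e.g. phase-folded lightcurves
           resampled to a common length and normalised.
    """
    rng = np.random.default_rng(seed)
    gx, gy = grid
    weights = rng.standard_normal((gx, gy, data.shape[1]))
    coords = np.stack(np.meshgrid(np.arange(gx), np.arange(gy),
                                  indexing="ij"), axis=-1)
    n_steps, step = epochs * len(data), 0
    for _ in range(epochs):
        for x in rng.permutation(data):
            frac = step / n_steps
            lr = lr0 * np.exp(-3.0 * frac)        # decaying learning rate
            sigma = sigma0 * np.exp(-3.0 * frac)  # shrinking neighbourhood
            # Best-matching unit: the node whose weights are closest to x.
            bmu = np.array(np.unravel_index(
                np.argmin(((weights - x) ** 2).sum(axis=-1)), (gx, gy)))
            # Gaussian neighbourhood pulls nearby nodes towards x.
            g = np.exp(-((coords - bmu) ** 2).sum(axis=-1) / (2 * sigma**2))
            weights += lr * g[..., None] * (x - weights)
            step += 1
    return weights
```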
We present an automatic classification method for astronomical catalogs with missing data. We use Bayesian networks, a probabilistic graphical model that allows us to perform inference to predict missing values given the observed data and the dependency relationships between variables. To learn a Bayesian network from incomplete data, we use an iterative algorithm that utilises sampling methods and expectation maximization to estimate the distributions and probabilistic dependencies of variables from data with missing values. To test our model we use three catalogs with missing data (SAGE, 2MASS and UBVI) and one complete catalog (MACHO). We examine how classification accuracy changes when information from missing data catalogs is included, how our method compares to traditional missing data approaches, and at what computational cost. Integrating these catalogs with missing data, we find that the classification of variable objects improves by a few percent, and by 15% for quasar detection, while keeping the computational cost the same.
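The full method learns both the structure and the parameters of a Bayesian network from incomplete data; as a self-contained illustration of the expectation-maximization step alone, the sketch below fits a multivariate Gaussian to a catalog with NaN-encoded missing values, alternating between conditional imputation (E-step) and parameter re-estimation (M-step). This is a deliberately simplified stand-in, not the paper's algorithm.

```python
import numpy as np

def em_gaussian_impute(X, n_iter=50):
    """EM for a multivariate Gaussian with missing values (NaNs).

    Illustrates the core EM idea behind learning distributions from
    incomplete catalogs; the Bayesian-network structure learning of
    the paper is not reproduced here.
    """
    X = X.copy()
    miss = np.isnan(X)
    # Initialise missing entries with the column means of observed data.
    mu = np.nanmean(X, axis=0)
    X[miss] = np.take(mu, np.where(miss)[1])
    sigma = np.cov(X, rowvar=False)

    for _ in range(n_iter):
        cov_add = np.zeros_like(sigma)
        # E-step: replace missing entries by their conditional expectation.
        for i in range(len(X)):
            m = miss[i]
            if not m.any():
                continue
            o = ~m
            Soo_inv = np.linalg.inv(sigma[np.ix_(o, o)])
            Smo = sigma[np.ix_(m, o)]
            X[i, m] = mu[m] + Smo @ Soo_inv @ (X[i, o] - mu[o])
            # Conditional covariance of the imputed block.
            cov_add[np.ix_(m, m)] += sigma[np.ix_(m, m)] - Smo @ Soo_inv @ Smo.T
        # M-step: re-estimate the Gaussian parameters from completed data.
        mu = X.mean(axis=0)
        diff = X - mu
        sigma = (diff.T @ diff + cov_add) / len(X)
    return X, mu, sigma
```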
ASTRONIRCAM is an infrared camera-spectrograph installed at the 2.5-meter telescope of the CMO SAI. The instrument is equipped with a HAWAII-2RG array. A classification of the bad pixels of the ASTRONIRCAM detector is proposed, based on histograms of the differences of consecutive non-destructive readouts of a flat field. Bad pixels are classified into 5 groups: hot (saturated on the first readout), warm (the signal accumulation rate exceeds the mean value by more than 5 standard deviations), cold (the rate is below the mean value by more than 5 standard deviations), dead (no signal accumulation), and inverse (a negative signal accumulation in the first readouts). Normal pixels account for 99.6% of the ASTRONIRCAM detector. We investigated how the number of bad pixels depends on the number of cooldown cycles of the instrument. While hot pixels remain the same, bad pixels of the other types may migrate between groups; the number of pixels in each group stays roughly constant. We found that the mean and variance of the number of bad pixels in each group, and the transitions between groups, do not differ noticeably between normal and slow cooldowns.
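A compact, deliberately simplified version of such a rule set can be written directly against a ramp of non-destructive readouts; in the sketch below the saturation level, the dead-pixel tolerance, and the order in which the rules are applied are illustrative assumptions rather than the paper's calibrated thresholds.

```python
import numpy as np

def classify_pixels(readouts, sat_level=6.0e4, n_sigma=5.0):
    """Toy bad-pixel classification from a ramp of readouts.

    readouts : (n_reads, ny, nx) stack of non-destructive readouts
               of a flat field, in ADU.
    """
    diffs = np.diff(readouts, axis=0)  # differences of consecutive readouts
    rate = diffs.mean(axis=0)          # per-pixel signal accumulation rate
    mean, std = rate.mean(), rate.std()

    labels = np.full(rate.shape, "normal", dtype=object)
    labels[rate > mean + n_sigma * std] = "warm"   # rate above mean + 5 sigma
    labels[rate < mean - n_sigma * std] = "cold"   # rate below mean - 5 sigma
    labels[np.isclose(rate, 0.0, atol=0.1 * std)] = "dead"  # no accumulation
    labels[diffs[0] < 0] = "inverse"               # negative first differences
    labels[readouts[0] >= sat_level] = "hot"       # saturated on first readout
    return labels
```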
Next-generation surveys like the Legacy Survey of Space and Time (LSST) on the Vera C. Rubin Observatory will generate orders of magnitude more discoveries of transients and variable stars than previous surveys. To prepare for this data deluge, we developed the Photometric LSST Astronomical Time-series Classification Challenge (PLAsTiCC), a competition that aimed to catalyze the development of robust classifiers under LSST-like conditions of a non-representative training set for a large photometric test set of imbalanced classes. Over 1,000 teams participated in PLAsTiCC, which was hosted on the Kaggle data science competition platform between Sep 28, 2018 and Dec 17, 2018, ultimately identifying three winners in February 2019. Participants produced classifiers employing a diverse set of machine learning techniques, including hybrid combinations and ensemble averages of a range of approaches, among them boosted decision trees, neural networks, and multi-layer perceptrons. The strong performance of the top three classifiers on Type Ia supernovae and kilonovae represents a major improvement over the current state-of-the-art within astronomy. This paper summarizes the most promising methods and evaluates their results in detail, highlighting future directions both for classifier development and simulation needs for a next-generation PLAsTiCC data set.
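As a minimal sketch of the probability-averaging ensembles mentioned above, the snippet below trains a few heterogeneous scikit-learn classifiers and averages their predicted class probabilities. The winning PLAsTiCC entries used far more elaborate hybrids and stacked models, and the feature matrix here is assumed to hold pre-extracted lightcurve features.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.neural_network import MLPClassifier

def ensemble_proba(X_train, y_train, X_test):
    """Average class probabilities over heterogeneous classifiers."""
    models = [
        GradientBoostingClassifier(),              # boosted decision trees
        RandomForestClassifier(n_estimators=300),  # bagged decision trees
        MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=500),
    ]
    probas = [m.fit(X_train, y_train).predict_proba(X_test) for m in models]
    return np.mean(probas, axis=0)                 # simple ensemble average
```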
The exploitation of present and future synoptic (multi-band and multi-epoch) surveys requires extensive use of automatic methods for data processing and data interpretation. In this work, using data extracted from the Catalina Real-Time Transient Survey (CRTS), we investigate the classification performance of some well-tested methods: Random Forest, MLPQNA (Multi-Layer Perceptron with Quasi-Newton Algorithm) and K-Nearest Neighbors, paying special attention to the feature selection phase. To this end, several classification experiments were performed: identification of cataclysmic variables, separation between galactic and extra-galactic objects, and identification of supernovae.
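A bare-bones version of such a pipeline, importance-based feature selection followed by a cross-validated comparison of classifiers, can be sketched with scikit-learn as below. MLPQNA is a bespoke implementation and is not reproduced here; only the two off-the-shelf methods are compared, and all hyperparameters are illustrative.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

def select_and_score(X, y):
    """Select features by Random Forest importance, then compare classifiers."""
    rf = RandomForestClassifier(n_estimators=500, random_state=0).fit(X, y)
    X_sel = SelectFromModel(rf, prefit=True).transform(X)  # keep top features
    return {
        "RandomForest": cross_val_score(
            RandomForestClassifier(n_estimators=500), X_sel, y, cv=5).mean(),
        "KNN": cross_val_score(
            KNeighborsClassifier(n_neighbors=10), X_sel, y, cv=5).mean(),
    }
```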