We aim to extend and test the classifiers presented in a previous work against an independent dataset. We complement the assessment of the validity of the classifiers by applying them to the set of OGLE light curves treated as variable objects of unknown class. The results are compared to published classification results based on the so-called extractor methods. Two complementary analyses are carried out in parallel. In both cases, the original time series of OGLE observations of the Galactic bulge and Magellanic Clouds are processed in order to identify and characterize the frequency components. In the first approach, the classifiers are applied to the data and the results analyzed in terms of systematic errors and differences between the definition samples in the training set and in the extractor rules. In the second approach, the original classifiers are extended with colour information and, again, applied to OGLE light curves. We have constructed a classification system that can process huge amounts of time series in negligible time and provide reliable samples of the main variability classes. We have evaluated its strengths and weaknesses and provide potential users of the classifier with a detailed description of its characteristics to aid in the interpretation of classification results. Finally, we apply the classifiers to obtain object samples of classes not previously studied in the OGLE database and analyse the results. We pay specific attention to the B-stars in the samples, as their pulsations are strongly dependent on metallicity.
We present an evaluation of the performance of an automated classification of the Hipparcos periodic variable stars into 26 types. The sub-sample with the most reliable variability types available in the literature is used to train supervised algorithms to characterize the type dependencies on a number of attributes. The most useful attributes evaluated with the random forest methodology include, in decreasing order of importance, the period, the amplitude, the V-I colour index, the absolute magnitude, the residual around the folded light-curve model, the magnitude distribution skewness and the amplitude of the second harmonic of the Fourier series model relative to that of the fundamental frequency. Random forests and a multi-stage scheme involving Bayesian network and Gaussian mixture methods lead to statistically equivalent results. In standard 10-fold cross-validation experiments, the rate of correct classification is between 90 and 100%, depending on the variability type. The main mis-classification cases, up to a rate of about 10%, arise due to confusion between SPB and ACV blue variables and between eclipsing binaries, ellipsoidal variables and other variability types. Our training set and the predicted types for the other Hipparcos periodic stars are available online.
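One of the attributes listed above, the amplitude of the second harmonic relative to the fundamental, can be illustrated with a short sketch. It assumes the period is already known and fits a mean plus two harmonics by ordinary least squares; the function names and the synthetic light curve are hypothetical and are not taken from the paper.

```python
import math

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def harmonic_amplitude_ratio(times, mags, period):
    """Fit mean + two harmonics at a known period and return A2 / A1."""
    rows = []
    for t in times:
        w = 2 * math.pi * t / period
        rows.append([1.0, math.cos(w), math.sin(w),
                     math.cos(2 * w), math.sin(2 * w)])
    k = 5
    # Normal equations (A^T A) c = A^T y of the linear least-squares problem.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    atb = [sum(r[i] * y for r, y in zip(rows, mags)) for i in range(k)]
    c = solve_linear(ata, atb)
    a1 = math.hypot(c[1], c[2])  # amplitude of the fundamental
    a2 = math.hypot(c[3], c[4])  # amplitude of the second harmonic
    return a2 / a1

# Synthetic, noise-free, unevenly sampled light curve (illustrative values).
period = 1.7
times = [0.31 * i + 0.05 * math.sin(i) for i in range(150)]
mags = [12.0
        + 0.4 * math.sin(2 * math.pi * t / period)
        + 0.1 * math.cos(4 * math.pi * t / period + 0.3)
        for t in times]
ratio = harmonic_amplitude_ratio(times, mags, period)
```

With real photometry the fit would also need measurement weights and a check that the phase coverage constrains both harmonics; neither is handled in this sketch.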
We present a novel automated methodology to detect and classify periodic variable stars in a large database of photometric time series. The methods are based on multivariate Bayesian statistics and use a multi-stage approach. We applied our method to the ground-based data of the TrES Lyr1 field, which is also observed by the Kepler satellite, covering ~26000 stars. We found many eclipsing binaries as well as classical non-radial pulsators, such as slowly pulsating B stars, Gamma Doradus, Beta Cephei and Delta Scuti stars. Also a few classical radial pulsators were found.
The exact period determination of a multi-periodic variable star based on its luminosity time series data is believed to be a task requiring skill and experience. Thus the majority of available time series analysis techniques require human intervention to some extent. The present work is dedicated to establishing an automated method of period (or frequency) determination from the time series database of variable stars. Relying on the SigSpec method (Reegen 2007), the technique established here employs a statistically unbiased treatment of frequency-domain noise and avoids spurious (i.e. noise-induced) and alias peaks to the highest possible extent. Several add-ons were incorporated to tailor SigSpec to our requirements. We present tests on 386 stars taken from the ASAS2 project database. From the output file produced by SigSpec, the frequency with maximum spectral significance is chosen as the genuine frequency. Out of 386 variable stars available in the ASAS2 database, our results contain 243 periods recovered exactly, 88 half periods, 42 different periods, etc. SigSpec has the potential to be used effectively for fully automated period detection from time series databases of variable stars. The exact detection of periods helps us to identify the type of variability and classify the variable stars, which provides crucial information on the physical processes operating in stellar atmospheres.
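The selection rule described above (take the frequency with the highest peak, then compare the result with a reference period) can be sketched as follows. This is not SigSpec: a plain discrete Fourier power is substituted for spectral significance, and the function names, frequency grid and tolerance are assumptions made for illustration only.

```python
import math

def dft_power(times, mags, freq):
    """Power of the mean-subtracted series at one trial frequency."""
    mean = sum(mags) / len(mags)
    re = sum((m - mean) * math.cos(2 * math.pi * freq * t)
             for t, m in zip(times, mags))
    im = sum((m - mean) * math.sin(2 * math.pi * freq * t)
             for t, m in zip(times, mags))
    return (re * re + im * im) / len(mags) ** 2

def best_period(times, mags, f_min, f_max, n_freq):
    """Return the period of the trial frequency with the highest power."""
    step = (f_max - f_min) / (n_freq - 1)
    freqs = [f_min + i * step for i in range(n_freq)]
    best = max(freqs, key=lambda f: dft_power(times, mags, f))
    return 1.0 / best

def compare_period(found, reference, rel_tol=0.01):
    """Label a recovered period as 'exact', 'half' or 'different'."""
    if abs(found - reference) <= rel_tol * reference:
        return "exact"
    if abs(found - reference / 2) <= rel_tol * reference / 2:
        return "half"
    return "different"

# Synthetic, unevenly sampled sinusoid with a 2.5-day period (illustrative).
times = [0.37 * i + 0.1 * math.sin(i) for i in range(200)]
mags = [10.0 + 0.3 * math.sin(2 * math.pi * t / 2.5) for t in times]
period = best_period(times, mags, 0.05, 1.0, 951)
label = compare_period(period, 2.5, rel_tol=0.02)
```

A raw power maximum is more exposed to alias and noise peaks than a significance-based criterion, which is the weakness the abstract says SigSpec is designed to reduce.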
We present a machine learning package for the classification of periodic variable stars. Our package is intended to be general: it can classify any single band optical light curve comprising at least a few tens of observations covering durations from weeks to years, with arbitrary time sampling. We use light curves of periodic variable stars taken from OGLE and EROS-2 to train the model. To make our classifier relatively survey-independent, it is trained on 16 features extracted from the light curves (e.g. period, skewness, Fourier amplitude ratio). The model classifies light curves into one of seven superclasses - Delta Scuti, RR Lyrae, Cepheid, Type II Cepheid, eclipsing binary, long-period variable, non-variable - as well as subclasses of these, such as ab, c, d, and e types for RR Lyraes. When trained to give only superclasses, our model achieves 0.98 for both recall and precision as measured on an independent validation dataset (on a scale of 0 to 1). When trained to give subclasses, it achieves 0.81 for both recall and precision. In order to assess classification performance of the subclass model, we applied it to the MACHO, LINEAR, and ASAS periodic variables, which gave recall/precision of 0.92/0.98, 0.89/0.96, and 0.84/0.88, respectively. We also applied the subclass model to Hipparcos periodic variable stars of many other variability types that do not exist in our training set, in order to examine how much those types degrade the classification performance of our target classes. In addition, we investigate how the performance varies with the number of data points and duration of observations. We find that recall and precision do not vary significantly if the number of data points is larger than 80 and the duration is more than a few weeks. The classifier software of the subclass model is available from the GitHub repository (https://goo.gl/xmFO6Q).
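For readers unfamiliar with the recall and precision figures quoted above, the following sketch computes both per class from lists of true and predicted labels. The abstract does not say how per-class values were combined into a single number, so no averaging is attempted here; the function name and the toy labels are hypothetical.

```python
from collections import Counter

def precision_recall(true_labels, predicted_labels):
    """Return {label: (precision, recall)} for every label that appears."""
    tp, pred_count, true_count = Counter(), Counter(), Counter()
    for t, p in zip(true_labels, predicted_labels):
        true_count[t] += 1
        pred_count[p] += 1
        if t == p:
            tp[t] += 1
    scores = {}
    for label in set(true_count) | set(pred_count):
        # Precision: fraction of objects assigned to the class that belong to it.
        precision = tp[label] / pred_count[label] if pred_count[label] else 0.0
        # Recall: fraction of true class members that were recovered.
        recall = tp[label] / true_count[label] if true_count[label] else 0.0
        scores[label] = (precision, recall)
    return scores

# Toy labels, not real survey data.
true_labels = ["RRL", "RRL", "CEPH", "EB", "EB", "EB"]
predicted_labels = ["RRL", "CEPH", "CEPH", "EB", "EB", "RRL"]
scores = precision_recall(true_labels, predicted_labels)
```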
With growing data volumes from synoptic surveys, astronomers must become more abstracted from the discovery and introspection processes. Given the scarcity of follow-up resources, there is a particularly sharp onus on the frameworks that replace these human roles to provide accurate and well-calibrated probabilistic classification catalogs. Such catalogs inform the subsequent follow-up, allowing consumers to optimize the selection of specific sources for further study and permitting rigorous treatment of purities and efficiencies for population studies. Here, we describe a process to produce a probabilistic classification catalog of variability with machine learning from a multi-epoch photometric survey. In addition to producing accurate classifications, we show how to estimate calibrated class probabilities, and motivate the importance of probability calibration. We also introduce a methodology for feature-based anomaly detection, which allows discovery of objects in the survey that do not fit within the predefined class taxonomy. Finally, we apply these methods to sources observed by the All Sky Automated Survey (ASAS), and unveil the Machine-learned ASAS Classification Catalog (MACC), which is a 28-class probabilistic classification catalog of 50,124 ASAS sources. We estimate that MACC achieves a sub-20% classification error rate, and demonstrate that the class posterior probabilities are reasonably calibrated. MACC classifications compare favorably to the classifications of several previous domain-specific ASAS papers and to the ASAS Catalog of Variable Stars, which had classified only 24% of those sources into one of 12 science classes. The MACC is publicly available at http://www.bigmacc.info.
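Whether class probabilities are "reasonably calibrated" can be checked with a simple diagnostic such as the one sketched below: predictions are binned by stated probability and each bin's mean probability is compared with the observed fraction of correct classifications. The abstract does not describe the calibration procedure itself, so this is a generic check, not the authors' method; the function names, bin count and toy values are assumptions.

```python
def reliability_table(probabilities, outcomes, n_bins=5):
    """Per non-empty bin: (mean predicted probability, observed fraction, count)."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probabilities, outcomes):
        index = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[index].append((p, y))
    table = []
    for members in bins:
        if members:
            mean_p = sum(p for p, _ in members) / len(members)
            observed = sum(y for _, y in members) / len(members)
            table.append((mean_p, observed, len(members)))
    return table

def brier_score(probabilities, outcomes):
    """Mean squared difference between stated probability and outcome (0 or 1)."""
    return sum((p - y) ** 2 for p, y in zip(probabilities, outcomes)) / len(outcomes)

# Toy values: probability assigned to the predicted class, and whether it was right.
probabilities = [0.1, 0.1, 0.9, 0.9, 0.9, 0.9]
outcomes = [0, 0, 1, 1, 1, 0]
table = reliability_table(probabilities, outcomes)
score = brier_score(probabilities, outcomes)
```

For well-calibrated probabilities the first two entries of each row should agree within sampling noise, which matters when the catalog is used to estimate sample purities.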