With growing data volumes from synoptic surveys, astronomers must become more abstracted from the discovery and introspection processes. Given the scarcity of follow-up resources, there is a particularly sharp onus on the frameworks that replace these human roles to provide accurate and well-calibrated probabilistic classification catalogs. Such catalogs inform the subsequent follow-up, allowing consumers to optimize the selection of specific sources for further study and permitting rigorous treatment of purities and efficiencies for population studies. Here, we describe a process to produce a probabilistic classification catalog of variability with machine learning from a multi-epoch photometric survey. In addition to producing accurate classifications, we show how to estimate calibrated class probabilities, and motivate the importance of probability calibration. We also introduce a methodology for feature-based anomaly detection, which allows discovery of objects in the survey that do not fit within the predefined class taxonomy. Finally, we apply these methods to sources observed by the All Sky Automated Survey (ASAS), and unveil the Machine-learned ASAS Classification Catalog (MACC), which is a 28-class probabilistic classification catalog of 50,124 ASAS sources. We estimate that MACC achieves a sub-20% classification error rate, and demonstrate that the class posterior probabilities are reasonably calibrated. MACC classifications compare favorably to the classifications of several previous domain-specific ASAS papers and to the ASAS Catalog of Variable Stars, which had classified only 24% of those sources into one of 12 science classes. The MACC is publicly available at http://www.bigmacc.info.
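To illustrate what "calibrated" means for class probabilities, the sketch below bins predicted probabilities and compares each bin's mean prediction with the observed fraction of true members, alongside a Brier score. It is a minimal example under stated assumptions, not the MACC procedure: the function names `reliability_table` and `brier_score`, the equal-width binning, and the toy inputs are all hypothetical.

```python
def reliability_table(probs, outcomes, n_bins=5):
    """Compare predicted probabilities with observed frequencies, bin by bin.

    probs    -- predicted probability that each source belongs to a class
    outcomes -- 1 if the source truly belongs to that class, else 0
    Returns (mean_predicted, observed_fraction, count) for each non-empty bin.
    """
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((p, y))
    table = []
    for members in bins:
        if not members:
            continue
        mean_p = sum(p for p, _ in members) / len(members)
        frac = sum(y for _, y in members) / len(members)
        table.append((mean_p, frac, len(members)))
    return table


def brier_score(probs, outcomes):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


# Toy inputs (hypothetical): four sources with predicted class probabilities.
example_probs = [0.1, 0.1, 0.9, 0.9]
example_truth = [0, 0, 1, 1]
for mean_p, frac, count in reliability_table(example_probs, example_truth):
    print(f"predicted {mean_p:.2f}  observed {frac:.2f}  n={count}")
```

For well-calibrated probabilities the predicted and observed columns track each other across bins; a systematic offset suggests over- or under-confident probabilities.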
We present a machine learning package for the classification of periodic variable stars. Our package is intended to be general: it can classify any single-band optical light curve comprising at least a few tens of observations covering durations from weeks to years, with arbitrary time sampling. We use light curves of periodic variable stars taken from OGLE and EROS-2 to train the model. To make our classifier relatively survey-independent, it is trained on 16 features extracted from the light curves (e.g. period, skewness, Fourier amplitude ratio). The model classifies light curves into one of seven superclasses - Delta Scuti, RR Lyrae, Cepheid, Type II Cepheid, eclipsing binary, long-period variable, non-variable - as well as subclasses of these, such as ab, c, d, and e types for RR Lyraes. When trained to give only superclasses, our model achieves 0.98 for both recall and precision as measured on an independent validation dataset (on a scale of 0 to 1). When trained to give subclasses, it achieves 0.81 for both recall and precision. In order to assess classification performance of the subclass model, we applied it to the MACHO, LINEAR, and ASAS periodic variables, which gave recall/precision of 0.92/0.98, 0.89/0.96, and 0.84/0.88, respectively. We also applied the subclass model to Hipparcos periodic variable stars of many other variability types that do not exist in our training set, in order to examine how much those types degrade the classification performance of our target classes. In addition, we investigate how the performance varies with the number of data points and duration of observations. We find that recall and precision do not vary significantly if the number of data points is larger than 80 and the duration is more than a few weeks. The classifier software of the subclass model is available from the GitHub repository (https://goo.gl/xmFO6Q).
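As an illustration of two of the feature types named above (skewness and a Fourier amplitude ratio), the following is a minimal sketch rather than the package's own implementation. It assumes the period is already known and fits harmonics by ordinary least squares via the normal equations; the function names `skewness`, `fourier_fit`, and `amplitude_ratio` are hypothetical.

```python
import math


def skewness(values):
    """Sample skewness: third central moment over the 1.5 power of the second."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fourier_fit(times, mags, period, n_harmonics=2):
    """Fit m(t) = a0 + sum_k [a_k cos(k w t) + b_k sin(k w t)], w = 2 pi / period.

    Returns (a0, amplitudes), where amplitudes[k-1] = sqrt(a_k^2 + b_k^2).
    Works for irregular time sampling because it is a least-squares fit.
    """
    omega = 2.0 * math.pi / period
    rows = []
    for t in times:
        row = [1.0]
        for k in range(1, n_harmonics + 1):
            row.append(math.cos(k * omega * t))
            row.append(math.sin(k * omega * t))
        rows.append(row)
    n = 2 * n_harmonics + 1
    # Normal equations: (A^T A) c = A^T y
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    atb = [sum(r[i] * y for r, y in zip(rows, mags)) for i in range(n)]
    coeffs = solve_linear(ata, atb)
    amps = [math.hypot(coeffs[2 * k - 1], coeffs[2 * k])
            for k in range(1, n_harmonics + 1)]
    return coeffs[0], amps


def amplitude_ratio(amps, i=2, j=1):
    """Ratio of the i-th to the j-th harmonic amplitude (e.g. R21 = A2 / A1)."""
    return amps[i - 1] / amps[j - 1]
```

Features of this kind depend on the light-curve shape rather than on survey-specific quantities such as zero points, which is one plausible reason a feature-based classifier can transfer between surveys.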
We describe a methodology to classify periodic variable stars identified using photometric time-series measurements constructed from the Wide-field Infrared Survey Explorer (WISE) full-mission single-exposure Source Databases. This will assist in the future construction of a WISE Variable Source Database that assigns variables to specific science classes as constrained by the WISE observing cadence with statistically meaningful classification probabilities. We have analyzed the WISE light curves of 8273 variable stars identified in previous optical variability surveys (MACHO, GCVS, and ASAS) and show that Fourier decomposition techniques can be extended into the mid-IR to assist with their classification. Combined with other periodic light-curve features, this sample is then used to train a machine-learned classifier based on the random forest (RF) method. Consistent with previous classification studies of variable stars in general, the RF machine-learned classifier is superior to other methods in terms of accuracy, robustness against outliers, and relative immunity to features that carry little or redundant class information. For the three most common classes identified by WISE: Algols, RR Lyrae, and W Ursae Majoris type variables, we obtain classification efficiencies of 80.7%, 82.7%, and 84.5% respectively using cross-validation analyses, with 95% confidence intervals of approximately ±2%. These accuracies are achieved at purity (or reliability) levels of 88.5%, 96.2%, and 87.8% respectively, similar to those achieved in previous automated classification studies of periodic variable stars.
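The efficiency and purity figures quoted above can be computed per class from true and predicted labels. The sketch below assumes the usual definitions (efficiency: fraction of true members of a class that the classifier recovers; purity: fraction of sources assigned to a class that truly belong to it). The function name `efficiency_and_purity` and the toy labels are hypothetical and are not taken from the WISE analysis.

```python
from collections import Counter


def efficiency_and_purity(true_labels, predicted_labels):
    """Return {class: (efficiency, purity)} from paired true/predicted labels.

    A class with no true members has undefined efficiency (NaN); a class
    that is never predicted has undefined purity (NaN).
    """
    true_counts = Counter(true_labels)
    pred_counts = Counter(predicted_labels)
    correct = Counter(
        t for t, p in zip(true_labels, predicted_labels) if t == p
    )
    result = {}
    for cls in sorted(set(true_counts) | set(pred_counts)):
        eff = correct[cls] / true_counts[cls] if true_counts[cls] else float("nan")
        pur = correct[cls] / pred_counts[cls] if pred_counts[cls] else float("nan")
        result[cls] = (eff, pur)
    return result


# Toy labels (hypothetical): five sources in three classes.
truth = ["Algol", "Algol", "RRLyr", "RRLyr", "WUMa"]
guess = ["Algol", "WUMa", "RRLyr", "RRLyr", "WUMa"]
for cls, (eff, pur) in efficiency_and_purity(truth, guess).items():
    print(f"{cls}: efficiency={eff:.2f} purity={pur:.2f}")
```

In a cross-validation setting, `predicted_labels` would hold the out-of-fold predictions, so that each source is scored by a model that did not see it during training.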
A novel application of machine-learning (ML) based image processing algorithms is proposed to analyze an all-sky map (ASM) obtained using the Fermi Gamma-ray Space Telescope. An attempt was made to simulate a one-year ASM from a short-exposure ASM generated from a one-week observation by applying three ML based image processing algorithms: dictionary learning, U-net, and Noise2Noise. Although the inference based on ML is less clear compared to standard likelihood analysis, the quality of the ASM was generally improved. In particular, the complicated diffuse emission associated with the galactic plane was successfully reproduced from only one week of observation data to mimic a ground truth (GT) generated from a one-year observation. Such ML algorithms can be implemented relatively easily to provide sharper images without various assumptions of emission models. In contrast, large deviations between the simulated ML maps and the GT map were found, which are attributed to the significant temporal variability of blazar-type active galactic nuclei (AGNs) over a year. Thus, the proposed ML methods are viable not only to improve the image quality of an ASM, but also to detect variable sources, such as AGNs, algorithmically, i.e., without human bias. Moreover, we argue that this approach is widely applicable to ASMs obtained by various other missions; thus, it has the potential to examine giant structures and transient events, both of which are rarely found in pointing observations.
We present a novel automated methodology to detect and classify periodic variable stars in a large database of photometric time series. The methods are based on multivariate Bayesian statistics and use a multi-stage approach. We applied our method to the ground-based data of the TrES Lyr1 field, which is also observed by the Kepler satellite, covering ~26000 stars. We found many eclipsing binaries as well as classical non-radial pulsators, such as slowly pulsating B stars, Gamma Doradus, Beta Cephei and Delta Scuti stars. A few classical radial pulsators were also found.
The Asteroid Terrestrial-impact Last Alert System (ATLAS) observes most of the sky every night in search of dangerous asteroids. Its data are also used to search for photometric variability, where sensitivity to variability is limited by photometric accuracy. Since each exposure spans 7.6 deg corner to corner, variations in atmospheric transparency in excess of 0.01 mag are common, and 0.01 mag photometry cannot be achieved by using a constant flat field calibration image. We therefore have assembled an all-sky reference catalog of approximately one billion stars to m~19 from a variety of sources to calibrate each exposure's astrometry and photometry. Gaia DR2 is the source of astrometry for this ATLAS Refcat2. The sources of g, r, i, z photometry include Pan-STARRS DR1, the ATLAS Pathfinder photometry project, ATLAS re-flattened APASS data, SkyMapper DR1, APASS DR9, the Tycho-2 catalog, and the Yale Bright Star Catalog. We have attempted to make this catalog at least 99% complete to m<19, including the brightest stars in the sky. We believe that the systematic errors are no larger than 5 millimag RMS, although errors are as large as 20 millimag in small patches near the galactic plane.