State-of-the-art exoplanet transit surveys are producing ever-increasing quantities of data. Making the best use of this resource, whether in detecting interesting planetary systems or in determining accurate planetary population statistics, requires new automated methods. Here we describe a machine learning algorithm that forms an integral part of the pipeline for the NGTS transit survey, demonstrating the efficacy of machine learning in selecting planetary candidates from multi-night ground-based survey data. Our method uses a combination of random forests and self-organising maps to rank planetary candidates, achieving an AUC score of 97.6% in ranking 12368 injected planets against 27496 false positives in the NGTS data. We build on past examples by using injected transit signals to form a training set, a necessary development for applying similar methods to upcoming surveys. We also make the \texttt{autovet} code used to implement the algorithm publicly accessible. \texttt{autovet} is designed to perform machine-learned vetting of planetary candidates, and can utilise a variety of methods. The apparent robustness of machine learning techniques, whether on space-based or the qualitatively different ground-based data, highlights their importance to future surveys such as TESS and PLATO and the need to better understand their advantages and pitfalls in an exoplanetary context.
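As a rough illustration of what the quoted AUC score measures, the sketch below computes the probability that a randomly chosen injected planet receives a higher ranking score than a randomly chosen false positive, with ties counted as half. The function name and the toy scores are hypothetical and are not taken from the autovet code; the quadratic pairwise loop is chosen for clarity rather than speed.

```python
def auc_from_scores(planet_scores, false_positive_scores):
    """Return the AUC as the fraction of (planet, false positive) pairs
    in which the planet is ranked higher; ties count as half a win.

    Hypothetical helper for illustration only, not part of autovet.
    """
    if not planet_scores or not false_positive_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for p in planet_scores:
        for f in false_positive_scores:
            if p > f:
                wins += 1.0
            elif p == f:
                wins += 0.5
    return wins / (len(planet_scores) * len(false_positive_scores))


# Toy ranking scores (assumed values, not NGTS data).
toy_planets = [0.9, 0.8, 0.4]
toy_false_positives = [0.7, 0.3, 0.2, 0.1]
toy_auc = auc_from_scores(toy_planets, toy_false_positives)
```

In this toy case 11 of the 12 pairs are ordered correctly, so the AUC is 11/12. A perfect ranking would give 1.0 and a random one about 0.5.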
The Transiting Exoplanet Survey Satellite (TESS) has now been operational for a little over two years, covering the Northern and the Southern hemispheres once. The TESS team processes the downlinked data using the Science Processing Operations Center pipeline and Quick Look pipeline to generate alerts for follow-up. Combined with other efforts from the community, over two thousand planet candidates have been found, of which tens have been confirmed as planets. We present our pipeline, Nigraha, that is complementary to these approaches. Nigraha uses a combination of transit finding, supervised machine learning, and detailed vetting to identify with high confidence a few planet candidates that were missed by prior searches. In particular, we identify high signal-to-noise ratio (SNR) shallow transits that may represent more Earth-like planets. In the spirit of open data exploration, we provide details of our pipeline, release our supervised machine learning model and code as open source, and make public the 38 candidates we have found in seven sectors. The model can easily be run on other sectors as is. As part of future work, we outline ways to increase the yield by strengthening some of the steps where we have been conservative and discarded objects for lack of a datum or two.
We present TRICERATOPS, a new Bayesian tool that can be used to vet and validate TESS Objects of Interest (TOIs). We test the tool on 68 TOIs that have been previously confirmed as planets or rejected as astrophysical false positives. By examining the plane of false positive probability (FPP) versus nearby false positive probability (NFPP), we define criteria that TOIs must meet to be classified as validated planets (FPP < 0.015 and NFPP < 10^-3), likely planets (FPP < 0.5 and NFPP < 10^-3), and likely nearby false positives (NFPP > 10^-1). We apply this procedure to 384 unclassified TOIs and statistically validate 12, classify 125 as likely planets, and classify 52 as likely nearby false positives. Of the 12 statistically validated planets, 9 are newly validated. TRICERATOPS is currently the only TESS vetting and validation tool that models transits from nearby contaminant stars in addition to the target star. We therefore encourage use of this tool to prioritize follow-up observations that confirm bona fide planets and identify false positives originating from nearby stars.
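The classification criteria stated above can be written as a small decision rule. The sketch below is a minimal illustration under stated assumptions: the function name and the "unclassified" label for TOIs meeting none of the criteria are hypothetical, and this is not the TRICERATOPS API. Only the thresholds come from the abstract. Because the validated-planet region lies inside the likely-planet region, the stricter FPP cut is tested first.

```python
def classify_toi(fpp, nfpp):
    """Map (FPP, NFPP) to one of the classes defined in the abstract.

    Hypothetical helper for illustration; the 'unclassified' label for
    everything else is an assumption, not a term from the source.
    """
    if nfpp > 1e-1:
        return "likely nearby false positive"
    if nfpp < 1e-3:
        if fpp < 0.015:
            return "validated planet"
        if fpp < 0.5:
            return "likely planet"
    return "unclassified"


# Toy probabilities (assumed values, not real TOI results).
toy_labels = [
    classify_toi(0.010, 1e-4),  # low FPP and low NFPP
    classify_toi(0.200, 1e-4),  # moderate FPP, low NFPP
    classify_toi(0.200, 0.3),   # high NFPP
    classify_toi(0.700, 1e-2),  # meets none of the criteria
]
```

With these toy inputs the four calls fall into the validated, likely planet, likely nearby false positive, and unclassified branches respectively.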
Since the start of the Wide Angle Search for Planets (WASP) program, more than 160 transiting exoplanets have been discovered in the WASP data. In the past, possible transit-like events identified by the WASP pipeline have been vetted by human inspection to eliminate false alarms and obvious false positives. The goal of the present paper is to assess the effectiveness of machine learning as a fast, automated, and reliable means of performing the same functions on ground-based wide-field transit-survey data without human intervention. To this end, we have created training and test datasets made up of stellar light curves showing a variety of signal types including planetary transits, eclipsing binaries, variable stars, and non-periodic signals. We use a combination of machine learning methods including Random Forest Classifiers (RFCs) and Convolutional Neural Networks (CNNs) to distinguish between the different types of signals. The final algorithms correctly identify planets in the test data ~90% of the time, although each method on its own has a significant fraction of false positives. We find that in practice, a combination of different methods offers the best approach to identifying the most promising exoplanet transit candidates in data from WASP, and by extension similar transit surveys.
The Kepler Mission was designed to identify and characterize transiting planets in the Kepler Field of View and to determine their occurrence rates. Emphasis was placed on identification of Earth-size planets orbiting in the Habitable Zone of their host stars. Science data were acquired for a period of four years. Long-cadence data with 29.4 min sampling were obtained for ~200,000 individual stellar targets in at least one observing quarter in the primary Kepler Mission. Light curves for target stars are extracted in the Kepler Science Data Processing Pipeline, and are searched for transiting planet signatures. A Threshold Crossing Event is generated in the transit search for targets where the transit detection threshold is exceeded and transit consistency checks are satisfied. These targets are subjected to further scrutiny in the Data Validation (DV) component of the Pipeline. Transiting planet candidates are characterized in DV, and light curves are searched for additional planets after transit signatures are modeled and removed. A suite of diagnostic tests is performed on all candidates to aid in discrimination between genuine transiting planets and instrumental or astrophysical false positives. Data products are generated per target and planet candidate to document and display transiting planet model fit and diagnostic test results. These products are exported to the Exoplanet Archive at the NASA Exoplanet Science Institute, and are available to the community. We describe the DV architecture and diagnostic tests, and provide a brief overview of the data products. Transiting planet modeling and the search for multiple planets on individual targets are described in a companion paper. The final revision of the Kepler Pipeline code base is available to the general public through GitHub. The Kepler Pipeline has also been modified to support the TESS Mission, which will commence in 2018.
NASA's Transiting Exoplanet Survey Satellite (TESS) presents us with an unprecedented volume of space-based photometric observations that must be analyzed in an efficient and unbiased manner. With at least $\sim$1,000,000 new light curves generated every month from full frame images alone, automated planet candidate identification has become an attractive alternative to human vetting. Here we present a deep learning model capable of performing triage and vetting on TESS candidates. Our model is modified from an existing neural network designed to automatically classify Kepler candidates, and is the first neural network to be trained and tested on real TESS data. In triage mode, our model can distinguish transit-like signals (planet candidates and eclipsing binaries) from stellar variability and instrumental noise with an average precision (the weighted mean of precisions over all classification thresholds) of 97.0% and an accuracy of 97.4%. In vetting mode, the model is trained to identify only planet candidates with the help of newly added scientific domain knowledge, and achieves an average precision of 69.3% and an accuracy of 97.8%. We apply our model to new data from Sector 6, and present 288 new signals that received the highest scores in triage and vetting and were also identified as planet candidates by human vetters. We also provide a homogeneously classified set of TESS candidates suitable for future training.
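To make the average precision metric concrete, the sketch below computes it as the mean of the precision values at each rank where a true positive is recovered. This is equivalent to weighting each threshold's precision by the increase in recall at that threshold. The function name and toy labels are hypothetical and are not taken from the model described above, and the sketch assumes no tied scores.

```python
def average_precision(labels, scores):
    """Average precision for binary labels (1 = positive, 0 = negative).

    Hypothetical helper for illustration; assumes scores have no ties.
    Each positive contributes the precision at its rank, weighted by
    1 / (number of positives), i.e. the recall gained at that threshold.
    """
    n_positive = sum(labels)
    if n_positive == 0:
        raise ValueError("at least one positive label is required")
    # Rank examples from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    true_positives = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            true_positives += 1
            total += true_positives / rank  # precision at this threshold
    return total / n_positive


# Toy labels and scores (assumed values, not TESS data).
toy_labels = [1, 0, 1, 1, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5]
toy_ap = average_precision(toy_labels, toy_scores)
```

Here the three positives are recovered at ranks 1, 3, and 4 with precisions 1, 2/3, and 3/4, giving an average precision of 29/36. Unlike accuracy, this metric is sensitive to how positives are ordered, which is why a model can report high accuracy alongside a much lower average precision when positives are rare.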