Real-Time Value-Driven Data Augmentation in the Era of LSST


Abstract in English

The deluge of data from time-domain surveys is rendering traditional human-guided data collection and inference techniques impractical. We propose a novel approach to data collection for science inference in the era of large-scale surveys that uses value-based metrics to autonomously strategize and coordinate follow-up in real time. We demonstrate the underlying principles in the Recommender Engine For Intelligent Transient Tracking (REFITT), which ingests live alerts from surveys and value-added inputs from data brokers to predict the future behavior of transients and design optimal data augmentation strategies given a set of scientific objectives. The prototype presented in this paper is tested on simulated Rubin Observatory Legacy Survey of Space and Time (LSST) core-collapse supernova (CC SN) light curves from the PLAsTiCC dataset. CC SNe were selected for the initial development phase because they are known to be difficult to classify, with the expectation that any learning techniques that work for them should be at least as effective for other transients. We demonstrate the behavior of REFITT on a random LSST night given ~32,000 live CC SNe of interest. The system makes good predictions for the photometric behavior of the events and uses them to plan follow-up using a simple data-driven metric. We argue that machine-directed follow-up maximizes the scientific potential of surveys and follow-up resources by reducing downtime and bias in data collection.
