A Machine Learning Method to Infer Fundamental Stellar Parameters from Photometric Light Curves


Abstract in English

A fundamental challenge for wide-field imaging surveys is obtaining follow-up spectroscopic observations: more than $10^9$ sources have been cataloged photometrically, yet modern spectroscopic surveys are limited to a few $\times 10^6$ targets. As we approach the Large Synoptic Survey Telescope (LSST) era, new algorithmic solutions are required to cope with the data deluge. Here we report the development of a machine-learning framework capable of inferring fundamental stellar parameters (Teff, log g, and [Fe/H]) using photometric-brightness variations and color alone. A training set is constructed from a systematic spectroscopic survey of variables with Hectospec/MMT. In total, the training set includes ~9000 spectra, for which stellar parameters are measured using the SEGUE Stellar Parameters Pipeline (SSPP). We employ the random forest algorithm to perform a non-parametric regression that predicts Teff, log g, and [Fe/H] from photometric time-domain observations. Our final, optimized model produces a cross-validated root-mean-square error (RMSE) of 165 K, 0.39 dex, and 0.33 dex for Teff, log g, and [Fe/H], respectively. For the subset of sources with the most reliable SSPP measurements, the RMSE reduces to 125 K, 0.37 dex, and 0.27 dex, respectively, comparable to what is achievable via low-resolution spectroscopy. For variable stars this represents a ~12-20% improvement in RMSE relative to models trained with single-epoch photometric colors. As an application of our method, we estimate stellar parameters for ~54,000 known variables. We argue that this method may convert photometric time-domain surveys into pseudo-spectrographic engines, enabling the construction of extremely detailed maps of the Milky Way, its structure, and history.
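As a rough illustration of the two quantities quoted above, the cross-validated RMSE and the relative improvement over a baseline, the following is a minimal sketch using only the Python standard library. It is not the authors' pipeline. The function names, the fold count `k=5`, and the nearest-neighbour stand-in regressor are all assumptions made for illustration; in practice the `fit_predict` callable would wrap a random forest regressor trained on light-curve features and colors.

```python
import math
import random


def rmse(y_true, y_pred):
    """Root-mean-square error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


def kfold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and deal them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_validated_rmse(X, y, fit_predict, k=5, seed=0):
    """Predict each source with a model that never saw it, then score all predictions."""
    y_pred = [None] * len(y)
    for fold in kfold_indices(len(y), k, seed):
        held_out = set(fold)
        train = [i for i in range(len(y)) if i not in held_out]
        preds = fit_predict(
            [X[i] for i in train],  # training features
            [y[i] for i in train],  # training labels (e.g. Teff from spectra)
            [X[i] for i in fold],   # held-out features
        )
        for i, p in zip(fold, preds):
            y_pred[i] = p
    return rmse(y, y_pred)


def nearest_neighbour(X_train, y_train, X_test):
    """Hypothetical stand-in regressor (NOT a random forest): copy the closest label."""
    def closest_label(x):
        j = min(
            range(len(X_train)),
            key=lambda j: sum((a - b) ** 2 for a, b in zip(x, X_train[j])),
        )
        return y_train[j]
    return [closest_label(x) for x in X_test]


def relative_improvement(rmse_baseline, rmse_new):
    """Fractional reduction in RMSE relative to a baseline model."""
    return (rmse_baseline - rmse_new) / rmse_baseline
```

Calling `cross_validated_rmse(X, y, nearest_neighbour)` once with color-only features and once with colors plus variability features, then passing the two results to `relative_improvement`, would give the kind of comparison summarized by the ~12-20% figure, under the assumptions stated above.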
