Photometric Classifications of Evolved Massive Stars: Preparing for the Era of Webb and Roman with Machine Learning


Abstract in English

In the coming years, next-generation space-based infrared observatories will significantly increase our samples of rare massive stars, offering a tremendous opportunity to apply modern statistical tools and methods to test massive stellar evolution in entirely new environments. Such work is only possible if the observed objects can be reliably classified. Because spectroscopic observations are infeasible for more distant targets, we test whether machine learning methods can classify massive stars using broadband infrared photometry. We find that a Support Vector Machine classifier can coarsely classify massive stars into hot, cool, and emission-line classes with high accuracy, while rejecting contaminating low-mass giants. Remarkably, 76% of emission-line stars can be recovered without narrowband or spectroscopic observations. We classify a sample of ${\sim}2500$ objects with no existing labels and identify fourteen candidate emission-line objects. Unfortunately, despite the high precision of the photometry in our sample, the heterogeneous origins of the labels for these stars severely inhibit our classifier's ability to distinguish classes of stars at finer granularity. Ultimately, no large and homogeneously labeled sample of massive stars currently exists. Without significant efforts to robustly classify evolved massive stars, which is feasible given existing data from large all-sky spectroscopic surveys, shortcomings in the labeling of existing data sets will hinder efforts to make full use of the next generation of space observatories.
