Decision Tree Classifiers for Star/Galaxy Separation


Abstract in English

We study the star/galaxy classification efficiency of 13 different decision tree algorithms applied to photometric objects in the Sloan Digital Sky Survey Data Release Seven (SDSS DR7). Each algorithm is defined by a set of parameters which, when varied, produce different final classification trees. We extensively explore the parameter space of each algorithm, using the set of $884,126$ SDSS objects with spectroscopic data as the training set. The efficiency of star-galaxy separation is measured using the completeness function. We find that the Functional Tree algorithm (FT) yields the best results as measured by the mean completeness in two magnitude intervals: $14\le r\le21$ ($85.2\%$) and $r\ge19$ ($82.1\%$). We compare the performance of the tree generated with the optimal FT configuration to the classifications provided by the SDSS parametric classifier, 2DPHOT and Ball et al. (2006). We find that our FT classifier is comparable or better in completeness over the full magnitude range $15\le r\le21$, with much lower contamination than all but the Ball et al. classifier. At the faintest magnitudes ($r>19$), our classifier is the only one able to maintain high completeness ($>80\%$) while still achieving low contamination ($\sim2.5\%$). Finally, we apply our FT classifier to separate stars from galaxies in the full set of $69,545,326$ SDSS photometric objects in the magnitude range $14\le r\le21$.
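The two figures of merit quoted above, completeness and contamination, can be illustrated with a minimal Python sketch. The abstract does not spell out the definitions, so this assumes the usual ones: completeness is the fraction of true galaxies that the classifier labels as galaxies, and contamination is the fraction of objects labelled as galaxies that are in fact stars. The function name, the label strings and the toy data are hypothetical and do not come from the paper.

```python
def completeness_and_contamination(true_labels, predicted_labels, target="galaxy"):
    """Return (completeness, contamination) for the `target` class.

    Assumed definitions (not taken from the paper):
      completeness  = correctly labelled targets / all true targets
      contamination = non-targets labelled as target / all objects labelled as target
    """
    pairs = list(zip(true_labels, predicted_labels))
    n_true = sum(1 for t, _ in pairs if t == target)      # true galaxies
    n_pred = sum(1 for _, p in pairs if p == target)      # objects labelled galaxy
    n_correct = sum(1 for t, p in pairs if t == target and p == target)

    # Guard against empty classes, e.g. a magnitude bin with no galaxies.
    completeness = n_correct / n_true if n_true else float("nan")
    contamination = (n_pred - n_correct) / n_pred if n_pred else float("nan")
    return completeness, contamination


# Toy data: spectroscopic classes (truth) versus a classifier's output.
spectroscopic = ["galaxy", "galaxy", "galaxy", "galaxy", "star", "star"]
classified = ["galaxy", "galaxy", "galaxy", "star", "galaxy", "star"]

comp, cont = completeness_and_contamination(spectroscopic, classified)
print(f"completeness = {comp:.2f}, contamination = {cont:.2f}")
```

In practice these quantities would be evaluated separately in bins of $r$ magnitude, since the abstract reports them as a function of magnitude range.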
