Machine learning technique for morphological classification of galaxies from the SDSS. I. Photometry-based approach


Abstract in English

Methods. We used several galaxy classification techniques: human labeling, multi-photometry diagrams, and the supervised methods Naive Bayes, Logistic Regression, Support Vector Machine, Random Forest, and k-Nearest Neighbors, evaluated with k-fold cross-validation.

Results. We present the results of a binary automated morphological classification of galaxies carried out by human labeling, multi-photometry diagrams, and supervised Machine Learning methods. We applied these methods to a sample of galaxies from the SDSS DR9 with 0.02 < z < 0.1 and -24m < Mr < -19.4m. As classifier features, we used the absolute magnitudes Mu, Mg, Mr, Mi, Mz, the color indices Mu-Mr, Mg-Mi, Mu-Mg, Mr-Mz, and the inverse concentration index R50/R90. Training the Support Vector Machine classifier on the color indices, absolute magnitudes, and inverse concentration indices of galaxies with visually assigned morphological types, we were able to classify 316 031 galaxies from the SDSS DR9 with unknown morphological types.

Conclusions. The Support Vector Machine and Random Forest methods, as implemented in the Scikit-learn machine learning library for Python, provide the highest accuracy for the binary morphological classification of galaxies: 96.4% correctly classified (96.1% early E and 96.9% late L types) for the Support Vector Machine and 95.5% correctly classified (96.7% early E and 92.8% late L types) for Random Forest. Applying the Support Vector Machine to the sample of 316 031 galaxies from the SDSS DR9 at z < 0.1, we found 141 211 E and 174 820 L types among them.
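As an illustration of the kind of comparison described above, the following minimal sketch cross-validates a Support Vector Machine and a Random Forest with Scikit-learn. It is not the authors' pipeline: the data are synthetic, the class means and scatters are illustrative assumptions rather than SDSS measurements, only three of the features are used (Mr, Mu-Mr, R50/R90), and the helper name `fake_galaxies`, the hyperparameters, and the choice of 5 folds are all hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_per_class = 300


def fake_galaxies(n, ur_mean, ci_mean):
    """Draw n synthetic galaxies with columns [Mr, Mu-Mr, R50/R90].

    The means and scatters are assumed values chosen only to make the
    two classes separable; they are not taken from the paper.
    """
    m_r = rng.uniform(-24.0, -19.4, size=n)      # absolute magnitude range of the sample
    u_r = rng.normal(ur_mean, 0.25, size=n)      # color index Mu-Mr
    inv_ci = rng.normal(ci_mean, 0.04, size=n)   # inverse concentration index R50/R90
    return np.column_stack([m_r, u_r, inv_ci])


# Class 0 = early (E): redder and more concentrated; class 1 = late (L).
X = np.vstack([
    fake_galaxies(n_per_class, ur_mean=2.6, ci_mean=0.32),
    fake_galaxies(n_per_class, ur_mean=1.7, ci_mean=0.43),
])
y = np.array([0] * n_per_class + [1] * n_per_class)

classifiers = {
    # SVMs are sensitive to feature scale, so standardize inside the pipeline.
    "SVM": make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale")),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Stratified k-fold keeps the E/L ratio the same in every fold.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

scores = {}
for name, clf in classifiers.items():
    scores[name] = cross_val_score(clf, X, y, cv=cv, scoring="accuracy")
    print(f"{name}: mean accuracy = {scores[name].mean():.3f} "
          f"(+/- {scores[name].std():.3f} over {cv.get_n_splits()} folds)")
```

Placing the scaler inside the pipeline means it is refit on the training part of each fold, so no information from the held-out galaxies leaks into the preprocessing. The accuracies printed here say nothing about real SDSS data; they only show how such a comparison could be set up.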
