Machine learning methods for constructing probabilistic Fermi-LAT catalogs


Abstract in English

Classification of sources is one of the most important tasks in astronomy. Sources detected in one wavelength band, for example in gamma rays, may have several possible associations in other wavebands, or there may be no plausible association candidates. In this work, we aim to determine the probabilistic classification of unassociated sources in the third Fermi Large Area Telescope (LAT) point source catalog (3FGL) and in the second data release of the fourth catalog (4FGL-DR2) into two classes (pulsars and active galactic nuclei (AGNs)) or three classes (pulsars, AGNs, and other sources). We use several machine learning (ML) methods for this purpose. We evaluate the dependence of the results on the meta-parameters of the ML methods, such as the maximal depth of the trees in tree-based classification methods and the number of neurons in neural networks. We determine the probabilistic classification of both associated and unassociated sources in the 3FGL and 4FGL-DR2 catalogs. We cross-check the accuracy by comparing the predicted classes of unassociated 3FGL sources that have associations in 4FGL-DR2 with those associations. We find that in the 2-class case it is important to correct for the presence of other sources among the unassociated ones in order to realistically estimate the number of pulsars and AGNs. In particular, the estimated number of pulsars in the 3FGL (4FGL-DR2) catalog is 270 (483) in the 2-class case without corrections for the other sources and 158 (215) in the 3-class case. Given that the number of associated pulsars is 167 (271) in the 3FGL (4FGL-DR2) catalog, the number of pulsars among the unassociated sources is expected to be similar to or larger than the number of associated ones.
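As an illustration of how per-source class probabilities can be turned into an estimated number of sources per class, the sketch below sums the probabilities over sources. This is an assumption made for illustration: the abstract does not state how the quoted counts were computed. The function name `expected_counts`, the class labels, and the toy probabilities are all hypothetical and are not taken from the catalogs.

```python
from typing import Dict, List


def expected_counts(probabilities: List[Dict[str, float]]) -> Dict[str, float]:
    """Sum per-source class probabilities to get an expected count per class.

    Assumption: the expected number of sources in a class is the sum of
    that class's probability over all sources.
    """
    totals: Dict[str, float] = {}
    for source_probs in probabilities:
        # Each source's probabilities must sum to 1 across the classes.
        if abs(sum(source_probs.values()) - 1.0) > 1e-6:
            raise ValueError("class probabilities must sum to 1 for every source")
        for cls, p in source_probs.items():
            totals[cls] = totals.get(cls, 0.0) + p
    return totals


# Hypothetical 3-class probabilities for four unassociated sources
# (illustrative values only, not catalog data).
toy_sources = [
    {"PSR": 0.90, "AGN": 0.05, "OTHER": 0.05},
    {"PSR": 0.20, "AGN": 0.70, "OTHER": 0.10},
    {"PSR": 0.10, "AGN": 0.85, "OTHER": 0.05},
    {"PSR": 0.40, "AGN": 0.20, "OTHER": 0.40},
]

counts = expected_counts(toy_sources)
for cls, n in sorted(counts.items()):
    print(f"{cls}: {n:.2f}")
```

In a 2-class setting the probability mass assigned to the third class here would have to be split between pulsars and AGNs, which is one way to see why the 2-class pulsar estimate can exceed the 3-class one.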
