Intervening or associated? Machine learning classification of redshifted H I 21-cm absorption


Abstract in English

In a previous paper we presented the results of applying machine learning to classify whether an H I 21-cm absorption spectrum arises in a source intervening the sight-line to a more distant radio source or within the host of the radio source itself. This is usually determined from an optical spectrum giving the source redshift. However, not only will this be impractical for the large number of sources expected to be detected with the Square Kilometre Array, but bright optical sources are the most ultra-violet luminous at high redshift and so introduce a bias against the detection of cool, neutral gas. Adding another 44 absorbers, most of them newly detected, to the previous sample of 92, we test four different machine learning algorithms, again using the line properties (width, depth and number of Gaussian fits) as features. Of these algorithms, three gave some improvement over the results from the previous sample, with a logistic regression model giving the best results. This suggests that the inclusion of further training data, as new absorbers are detected, will further increase the prediction accuracy above the current 80%. We use the logistic regression model to classify the z = 0.42 absorption towards PKS 1657-298 and find it to be associated, consistent with a previous study that determined a similar redshift from the K-band magnitude-redshift relation.
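
The classification step described above can be illustrated with a minimal sketch, which is not the authors' implementation. It fits a logistic regression by plain gradient descent on the three line properties named in the abstract (width, depth and number of Gaussian fits), using only the Python standard library. All function names, the learning rate and the number of iterations are assumptions made for illustration. The training data are synthetic: the two classes are separated arbitrarily so that the example runs, and the numbers carry no astrophysical meaning.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def standardise(rows):
    """Scale each feature column to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [
        math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
        for c, m in zip(cols, means)
    ]
    scaled = [[(x - m) / s for x, m, s in zip(r, means, stds)] for r in rows]
    return scaled, means, stds


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Batch gradient descent on the mean log-loss (illustrative settings)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(w, b, x):
    """Probability that a (standardised) feature vector has label 1."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


# SYNTHETIC data, invented for this sketch only (not the paper's sample).
# Features: [line width in km/s, peak depth, number of Gaussian fits].
# Label 1 = "associated", label 0 = "intervening".
rng = random.Random(0)
rows, labels = [], []
for _ in range(40):
    rows.append([max(1.0, rng.gauss(150, 30)), abs(rng.gauss(0.05, 0.02)), rng.randint(1, 3)])
    labels.append(1)
    rows.append([max(1.0, rng.gauss(50, 20)), abs(rng.gauss(0.20, 0.05)), rng.randint(1, 2)])
    labels.append(0)

X, means, stds = standardise(rows)
weights, bias = fit_logistic(X, labels)

# Fraction of the (synthetic) training set classified correctly.
accuracy = sum(
    (predict_proba(weights, bias, x) >= 0.5) == bool(y) for x, y in zip(X, labels)
) / len(labels)

# Classify a hypothetical new absorber (values are made up, not PKS 1657-298).
new_line = [120.0, 0.06, 2]
new_scaled = [(v - m) / s for v, m, s in zip(new_line, means, stds)]
p_new = predict_proba(weights, bias, new_scaled)
print(f"training accuracy = {accuracy:.2f}, P(associated) = {p_new:.2f}")
```

In practice the accuracy would be estimated on held-out absorbers rather than on the training set, and an established library implementation would normally be preferred over hand-written gradient descent.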
