Robust Assessment of Clustering Methods for Fast Radio Transient Candidates


Abstract in English

Fast radio transient search algorithms identify signals of interest by iterating over a set of matched filters and applying a detection threshold to each. These filters are defined by properties of the transient such as time and dispersion. A real transient can trigger hundreds of search trials, each of which has to be post-processed for visualization and classification tasks. In this paper, we explore a range of unsupervised clustering algorithms to cluster these redundant candidate detections. We demonstrate this for Realfast, the commensal fast transient search system at the Very Large Array. We use four features for clustering: sky position (l, m), time, and dispersion measure (DM). We develop a custom performance metric that ensures that the candidates are clustered into a small number of pure clusters, i.e., clusters containing either astrophysical or noise candidates, but not both. We then use this performance metric to compare eight different clustering algorithms. We show that using sky location along with DM/time improves clustering performance by $\sim$10% compared to traditional DM/time-based clustering. Therefore, positional information should be used during clustering if it can be made available. We conduct several tests to compare the performance and generalisability of clustering algorithms to other transient datasets and propose a strategy that can be used to choose an algorithm. Our performance metric and clustering strategy can be easily extended to different single-pulse search pipelines and to other applications, both within and outside astronomy.
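The role of sky position in clustering can be illustrated with a small toy sketch. Everything in it is an assumption made for illustration: the candidate values, the per-feature scales, the simple friends-of-friends grouping (not necessarily one of the eight algorithms compared in the paper), and the plain purity score (a stand-in, not the custom performance metric developed in the paper).

```python
from collections import Counter


def friends_of_friends(points, scales, link=1.0):
    """Single-linkage grouping: join points whose scaled distance is <= link."""
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            d2 = sum(((a - b) / s) ** 2
                     for a, b, s in zip(points[i], points[j], scales))
            if d2 <= link ** 2:
                parent[find(i)] = find(j)

    # Relabel roots as 0, 1, 2, ... in order of first appearance.
    relabel = {}
    return [relabel.setdefault(find(i), len(relabel)) for i in range(n)]


def purity(labels, truth):
    """Fraction of candidates in the majority class of their own cluster."""
    by_cluster = {}
    for lab, t in zip(labels, truth):
        by_cluster.setdefault(lab, Counter())[t] += 1
    return sum(max(c.values()) for c in by_cluster.values()) / len(labels)


# Hypothetical candidates: (l, m, time, DM). Units and values are invented.
candidates = [
    (10, 12, 1.000, 560.0),   # astrophysical
    (10, 12, 1.002, 562.0),   # astrophysical
    (11, 12, 1.001, 558.0),   # astrophysical
    (40, -5, 1.001, 561.0),   # noise, same time/DM, different sky position
    (41, -5, 1.003, 559.0),   # noise
    (-30, 20, 7.500, 120.0),  # noise, isolated
]
truth = ["astro", "astro", "astro", "noise", "noise", "noise"]

# Assumed linking scales for each feature.
labels_4d = friends_of_friends(candidates, scales=(2.0, 2.0, 0.01, 5.0))
labels_2d = friends_of_friends([c[2:] for c in candidates], scales=(0.01, 5.0))

print("l, m, time, DM:", labels_4d, "purity =", purity(labels_4d, truth))
print("time, DM only: ", labels_2d, "purity =", purity(labels_2d, truth))
```

In this toy data, clustering on time and DM alone merges the astrophysical candidates with noise that happens to share their time and DM, while adding (l, m) keeps the two groups apart. The numbers say nothing about the size of the improvement on real data.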
