Probabilistic multi-catalogue positional cross-match


Abstract in English

We lay the foundations of a statistical framework for multi-catalogue cross-correlation and cross-identification based on explicit, simplified catalogue models. A proper identification process should rely on both astrometric and photometric data. Under some conditions, the astrometric and photometric parts can be processed separately and merged a posteriori to provide a single global probability of identification. The present paper addresses almost exclusively the astrometric part and specifies the proper probabilities to be merged with photometric likelihoods. To select matching candidates in n catalogues, we used the Chi (or, equivalently, the Chi-square) test with 2(n-1) degrees of freedom. We thus call this cross-match a chi-match. In order to use Bayes' formula, we considered exhaustive sets of hypotheses based on combinatorial analysis. The volume of the Chi-test domain of acceptance, a 2(n-1)-dimensional acceptance ellipsoid, is used to estimate the expected numbers of spurious associations. We derived priors for those numbers using a frequentist approach relying on simple geometrical considerations. Likelihoods are based on standard Rayleigh, Chi and Poisson distributions that we normalized over the Chi-test acceptance domain. We validated our theoretical results by generating and cross-matching synthetic catalogues. The results we obtain do not depend on the order in which the catalogues are cross-correlated. We applied the formalism described in the present paper to build the multi-wavelength catalogues used for the science cases of the ARCHES (Astronomical Resource Cross-matching for High Energy Studies) project. Our cross-matching engine is publicly available through a multi-purpose web interface. In the longer term, we plan to integrate this tool into the CDS XMatch Service.
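
As an illustration of the candidate-selection step only, the following minimal sketch applies a Chi-square test with 2(n-1) degrees of freedom to one candidate tuple of n sources. It is not the authors' engine. It assumes independent, circular Gaussian positional errors and flat tangent-plane coordinates, which is a simplification chosen for the example. The function names (`chi2_survival_even_dof`, `chi_match`) and the default threshold `alpha` are hypothetical. Because 2(n-1) is always even, the Chi-square survival function has a closed form and no external library is needed.

```python
import math


def chi2_survival_even_dof(x, dof):
    """Return P(X > x) for a chi-square variable with an even number of dof.

    For dof = 2m the survival function has the closed form
    exp(-x/2) * sum_{j=0}^{m-1} (x/2)^j / j!.
    """
    if dof <= 0 or dof % 2 != 0:
        raise ValueError("dof must be a positive even integer")
    half = x / 2.0
    term = 1.0   # (x/2)^0 / 0!
    total = 1.0
    for j in range(1, dof // 2):
        term *= half / j
        total += term
    return math.exp(-half) * total


def chi_match(positions, sigmas, alpha=0.0027):
    """Test whether n positions are compatible with a single true position.

    positions: list of (x, y) tangent-plane coordinates, one per catalogue.
    sigmas:    list of 1-sigma circular positional errors, same units.
    alpha:     hypothetical significance level (assumption, not from the paper).
    Returns (chi2, dof, accepted).
    """
    n = len(positions)
    if n < 2 or n != len(sigmas):
        raise ValueError("need at least two positions, one sigma per position")
    weights = [1.0 / s ** 2 for s in sigmas]
    wsum = sum(weights)
    # Inverse-variance weighted mean position.
    mx = sum(w * x for w, (x, _) in zip(weights, positions)) / wsum
    my = sum(w * y for w, (_, y) in zip(weights, positions)) / wsum
    # Weighted sum of squared offsets from the mean: chi-square with 2(n-1) dof
    # when all n positions are measurements of the same source.
    chi2 = sum(w * ((x - mx) ** 2 + (y - my) ** 2)
               for w, (x, y) in zip(weights, positions))
    dof = 2 * (n - 1)
    accepted = chi2_survival_even_dof(chi2, dof) >= alpha
    return chi2, dof, accepted
```

For two catalogues this reduces to the familiar criterion d^2 / (sigma_1^2 + sigma_2^2) compared against a Chi-square threshold with 2 degrees of freedom, where d is the separation of the pair. Because the statistic is symmetric in its inputs, the outcome for a given tuple does not depend on the order in which the positions are supplied.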
