Selection Functions in Astronomical Data Modeling, with the Space Density of White Dwarfs as Worked Example


Abstract in English

Statistical studies of astronomical data sets, in particular of cataloged properties for discrete objects, are central to astrophysics. One cannot model those objects' population properties or incidences without a quantitative understanding of the conditions under which these objects ended up in a catalog or sample, i.e., the sample's selection function. As systematic and didactic introductions to this topic are scarce in the astrophysical literature, we aim to provide one, addressing generically the following questions: What is a selection function? What arguments $\vec{q}$ should a selection function depend on? Over what domain must a selection function be defined? What approximations and simplifications can be made? And how is a selection function used in modelling? We argue that volume-complete samples, with the volume drastically curtailed by the faintest objects, reflect a highly sub-optimal selection function that needlessly reduces the number of bright and usually rare objects in the sample. We illustrate these points with a worked example, deriving the space density of white dwarfs (WDs) in the Galactic neighbourhood as a function of their luminosity and Gaia color, $\Phi_0(M_G,B-R)$ in [mag$^{-2}$ pc$^{-3}$]. We construct a sample of $10^5$ presumed WDs through straightforward selection cuts on the Gaia EDR3 catalog, in magnitude, color, parallax, and astrometric fidelity, $\vec{q}=(m_G,B-R,\varpi,p_{af})$. We then combine a simple model for $\Phi_0$ with the effective survey volume derived from this selection function, $S_C(\vec{q})$, to derive a detailed and robust estimate of $\Phi_0(M_G,B-R)$. The resulting white dwarf luminosity-color function $\Phi_0(M_G,B-R)$ differs dramatically from the initial number density distribution in the luminosity-color plane: by orders of magnitude in density and by four magnitudes in the location of the density peak.
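The role of the effective survey volume can be illustrated with a deliberately simplified sketch. It is not the paper's pipeline: it assumes a selection function that is a pure step in apparent magnitude, a spatially uniform density, and no dust extinction, and it ignores the color, parallax, and astrometric-fidelity cuts. The function names and the limiting magnitude of 20 are hypothetical choices for illustration only.

```python
import math


def max_distance_pc(abs_mag, m_lim):
    """Distance (pc) at which an object of absolute magnitude abs_mag
    reaches the apparent magnitude limit m_lim (distance modulus, no dust)."""
    return 10.0 ** ((m_lim - abs_mag) / 5.0 + 1.0)


def effective_volume_pc3(abs_mag, m_lim, d_cap_pc=float("inf")):
    """Effective volume (pc^3) for a step-function selection in apparent
    magnitude, optionally capped at a fixed distance d_cap_pc.
    Assumes uniform density, so the volume is a full sphere."""
    d = min(max_distance_pc(abs_mag, m_lim), d_cap_pc)
    return 4.0 / 3.0 * math.pi * d ** 3


def density_estimate(n_objects, abs_mag, m_lim, d_cap_pc=float("inf")):
    """Space density (pc^-3): object count divided by effective volume."""
    return n_objects / effective_volume_pc3(abs_mag, m_lim, d_cap_pc)


# Illustrative numbers only (assumed, not taken from the paper).
M_LIM = 20.0
bright, faint = 10.0, 15.0

# A volume-complete sample is capped at the distance of the faintest objects.
d_complete = max_distance_pc(faint, M_LIM)

v_full = effective_volume_pc3(bright, M_LIM)
v_capped = effective_volume_pc3(bright, M_LIM, d_cap_pc=d_complete)
print(f"bright objects visible to {max_distance_pc(bright, M_LIM):.0f} pc")
print(f"volume-complete cap at {d_complete:.0f} pc")
print(f"bright-object volume lost by the cap: factor {v_full / v_capped:.0f}")
```

Under these assumptions, capping the sample at the distance where the faintest objects are still detectable shrinks the volume available to objects five magnitudes brighter by the ratio of the two spheres, which is the sense in which a volume-complete sample discards bright, rare objects. Dividing counts by the magnitude-dependent effective volume instead is also why the inferred density distribution can peak at a very different luminosity than the raw counts.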
