We demonstrate the ability of convolutional neural networks (CNNs) to mitigate systematics in the virial scaling relation and produce dynamical mass estimates of galaxy clusters with remarkably low bias and scatter. We present two models, CNN$_\mathrm{1D}$ and CNN$_\mathrm{2D}$, which leverage this deep learning tool to infer cluster masses from distributions of member galaxy dynamics. Our first model, CNN$_\text{1D}$, infers cluster mass directly from the distribution of member galaxy line-of-sight velocities. Our second model, CNN$_\text{2D}$, extends the input space of CNN$_\text{1D}$ to learn on the joint distribution of galaxy line-of-sight velocities and projected radial distances. We train each model as a regression over cluster mass using a labeled catalog of realistic mock cluster observations generated from the MultiDark simulation and UniverseMachine catalog. We then evaluate the performance of each model on an independent set of mock observations selected from the same simulated catalog. The CNN models produce cluster mass predictions with lognormal residuals of scatter as low as $0.132$ dex, greater than a factor of 2 improvement over the classical $M$-$\sigma$ power-law estimator. Furthermore, the CNN model reduces prediction scatter relative to similar machine learning approaches by up to $17\%$ while executing in drastically shorter training and evaluation times (by a factor of 30) and producing considerably more robust mass predictions (improving prediction stability under variations in galaxy sampling rate by $30\%$).
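A CNN needs a fixed-length input, whereas each cluster has a different number of member galaxies. The abstract does not say how the velocity distribution is represented, so the following is only a minimal sketch of one possible choice: a normalized histogram on a fixed velocity grid. The function name `velocity_histogram` and the parameters `v_max` and `n_bins` are hypothetical, chosen for illustration.

```python
import numpy as np

def velocity_histogram(v_los, v_max=2500.0, n_bins=48):
    """Normalized histogram of line-of-sight velocities on a fixed grid.

    v_max (km/s) and n_bins are hypothetical choices, not taken from the source.
    Velocities outside [-v_max, v_max] are ignored by np.histogram.
    """
    density, edges = np.histogram(
        np.asarray(v_los, dtype=float),
        bins=n_bins,
        range=(-v_max, v_max),
        density=True,  # bin heights integrate to 1 over the grid
    )
    return density, edges

# Synthetic velocities for a single mock cluster (illustrative only).
rng = np.random.default_rng(0)
v = rng.normal(0.0, 800.0, size=200)
density, edges = velocity_histogram(v)
```

Every cluster then maps to a vector of the same length regardless of its galaxy count, which is the property a 1D convolutional input layer requires.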
We present a modern machine learning approach for cluster dynamical mass measurements that is a factor of two improvement over using a conventional scaling relation. Different methods are tested against a mock cluster catalog constructed using halos with mass $\geq 10^{14}\,M_\odot\,h^{-1}$ from MultiDark's publicly available N-body MDPL halo catalog. In the conventional method, we use a standard $M(\sigma_v)$ power-law scaling relation to infer cluster mass, $M$, from line-of-sight (LOS) galaxy velocity dispersion, $\sigma_v$. The resulting fractional mass error distribution is broad, with width = 0.87 (68% scatter), and has extended high-error tails. The standard scaling relation can be simply enhanced by including higher-order moments of the LOS velocity distribution. Applying the kurtosis as a correction term to $\log(\sigma_v)$ reduces the width of the error distribution to 0.74 (16% improvement). Machine learning can be used to take full advantage of all the information in the velocity distribution. We employ the Support Distribution Machines (SDMs) algorithm, which learns from distributions of data to predict single values. SDMs trained and tested on the distribution of LOS velocities yield width = 0.46 (47% improvement). Furthermore, the problematic tails of the mass error distribution are effectively eliminated. Decreasing cluster mass errors will improve measurements of the growth of structure and lead to tighter constraints on cosmological parameters.
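The conventional method described above can be sketched as a straight-line fit in log space. This is a rough illustration under stated assumptions, not the authors' pipeline: the fractional error is taken to be $(M_\mathrm{pred} - M)/M$, the width is taken to be the 16th to 84th percentile range, and all function names and the synthetic data are hypothetical. The kurtosis correction is omitted because the abstract does not give its form.

```python
import numpy as np

def fit_power_law(sigma_v, mass):
    """Least-squares fit of log10(M) = a * log10(sigma_v) + b."""
    a, b = np.polyfit(np.log10(sigma_v), np.log10(mass), deg=1)
    return a, b

def predict_mass(sigma_v, a, b):
    """Evaluate the fitted power law M = 10**b * sigma_v**a."""
    return 10.0 ** (a * np.log10(sigma_v) + b)

def fractional_error(m_pred, m_true):
    """Assumed definition: (M_pred - M_true) / M_true."""
    return (m_pred - m_true) / m_true

def width_68(errors):
    """Assumed definition of the width: 84th minus 16th percentile."""
    lo, hi = np.percentile(errors, [16, 84])
    return hi - lo

# Synthetic catalog (illustrative only): sigma_v scales as M^(1/3) with scatter.
rng = np.random.default_rng(1)
mass = 10.0 ** rng.uniform(14.0, 15.0, size=2000)
sigma_v = 1000.0 * (mass / 1e15) ** (1.0 / 3.0) * 10.0 ** rng.normal(0.0, 0.05, size=2000)

a, b = fit_power_law(sigma_v, mass)
eps = fractional_error(predict_mass(sigma_v, a, b), mass)
print(f"slope a = {a:.2f}, 68% width = {width_68(eps):.2f}")
```

Because mass depends steeply on velocity dispersion, modest scatter in $\sigma_v$ becomes large scatter in mass, which is one way to see why the error distribution of this estimator is broad.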
We study dynamical mass measurements of galaxy clusters contaminated by interlopers and show that a modern machine learning (ML) algorithm can improve mass predictions by better than a factor of two compared to a standard scaling relation approach. We create two mock catalogs from MultiDark's publicly available $N$-body MDPL1 simulation, one with perfect galaxy cluster membership information and the other where a simple cylindrical cut around the cluster center allows interlopers to contaminate the clusters. In the standard approach, we use a power-law scaling relation to infer cluster mass from galaxy line-of-sight (LOS) velocity dispersion. Assuming perfect membership knowledge, an unrealistic case, this approach produces a wide fractional mass error distribution, with a width of $\Delta\epsilon\approx0.87$. Interlopers introduce additional scatter, significantly widening the error distribution further ($\Delta\epsilon\approx2.13$). We employ the support distribution machine (SDM) class of algorithms to learn from distributions of data to predict single values. Applied to distributions of galaxy observables such as LOS velocity and projected distance from the cluster center, SDM yields better than a factor-of-two improvement ($\Delta\epsilon\approx0.67$) for the contaminated case. Remarkably, SDM applied to contaminated clusters is better able to recover masses than even the scaling relation approach applied to uncontaminated clusters. We show that the SDM method more accurately reproduces the cluster mass function, making it a valuable tool for employing cluster observations to evaluate cosmological models.
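The cylindrical cut mentioned above can be illustrated with a short sketch: keep every galaxy whose projected distance and LOS velocity offset from the cluster center fall inside a fixed aperture. The function name and the default values of `r_aperture` and `v_cut` are hypothetical, since the abstract does not state the aperture used.

```python
import numpy as np

def cylindrical_cut(r_proj, v_los, r_aperture=1.6, v_cut=2500.0):
    """Boolean mask of galaxies inside a cylinder around the cluster center.

    r_proj : projected distance from the center (e.g. Mpc/h)
    v_los  : LOS velocity relative to the cluster (km/s)
    r_aperture, v_cut : hypothetical aperture sizes, not taken from the source.
    """
    r_proj = np.asarray(r_proj, dtype=float)
    v_los = np.asarray(v_los, dtype=float)
    return (r_proj <= r_aperture) & (np.abs(v_los) <= v_cut)

# Four example galaxies: the second lies outside the radius,
# the third outside the velocity window.
mask = cylindrical_cut([0.5, 2.0, 1.0, 0.1], [300.0, 100.0, -3000.0, -2400.0])
```

A cut of this kind uses only observable quantities, so any galaxy that happens to lie along the line of sight within the velocity window is kept whether or not it is gravitationally bound to the cluster. That is how interlopers enter the sample.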
We study methods for reconstructing Bayesian uncertainties on dynamical mass estimates of galaxy clusters using convolutional neural networks (CNNs). We discuss the statistical background of approximate Bayesian neural networks and demonstrate how variational inference techniques can be used to perform computationally tractable posterior estimation for a variety of deep neural architectures. We explore how various model designs and statistical assumptions impact prediction accuracy and uncertainty reconstruction in the context of cluster mass estimation. We measure the quality of our model posterior recovery using a mock cluster observation catalog derived from the MultiDark simulation and UniverseMachine catalog. We show that approximate Bayesian CNNs produce highly accurate dynamical cluster mass posteriors. These model posteriors are log-normal in cluster mass and recover $68\%$ and $90\%$ confidence intervals to within $1\%$ of their measured value. We note how this rigorous modeling of dynamical mass posteriors is necessary for using cluster abundance measurements to constrain cosmological parameters.
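One way to check interval recovery of the kind quoted above is an empirical coverage test: for posteriors that are Gaussian in $\log_{10} M$, count how often the true value falls inside the central interval of a given level. The sketch below assumes each posterior is summarized by a mean `mu` and standard deviation `sigma` in log mass. These names, the function `interval_coverage`, and the synthetic data are illustrative and not taken from the source.

```python
from statistics import NormalDist

import numpy as np

def interval_coverage(log_m_true, mu, sigma, level):
    """Fraction of true log-masses inside the central `level` interval
    of a Gaussian posterior N(mu, sigma^2) in log10(M)."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # half-width in units of sigma
    inside = np.abs(np.asarray(log_m_true) - np.asarray(mu)) <= z * np.asarray(sigma)
    return float(np.mean(inside))

# Synthetic, well-calibrated posteriors (illustrative only).
rng = np.random.default_rng(2)
mu = rng.uniform(14.0, 15.0, size=200_000)
sigma = np.full_like(mu, 0.1)
log_m_true = mu + sigma * rng.normal(size=mu.size)

cov68 = interval_coverage(log_m_true, mu, sigma, 0.68)
cov90 = interval_coverage(log_m_true, mu, sigma, 0.90)
```

For well-calibrated posteriors, `cov68` and `cov90` should sit close to 0.68 and 0.90. Overconfident posteriors give lower coverage and underconfident ones give higher coverage.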
In light of the tension in cosmological constraints reported by the Planck team between their SZ-selected cluster counts and Cosmic Microwave Background (CMB) temperature anisotropies, we compare the Planck cluster mass estimates with robust, weak-lensing mass measurements from the Weighing the Giants (WtG) project. For the 22 clusters in common between the Planck cosmology sample and WtG, we find an overall mass ratio of $\left< M_{Planck}/M_{\rm WtG} \right> = 0.688 \pm 0.072$. Extending the sample to clusters not used in the Planck cosmology analysis yields a consistent value of $\left< M_{Planck}/M_{\rm WtG} \right> = 0.698 \pm 0.062$ from 38 clusters in common. Identifying the weak-lensing masses as proxies for the true cluster mass (on average), these ratios are $\sim 1.6\sigma$ lower than the default mass bias of 0.8 assumed in the Planck cluster analysis. Adopting the WtG weak-lensing-based mass calibration would substantially reduce the tension found between the Planck cluster count cosmology results and those from CMB temperature anisotropies, thereby dispensing with the need for new physics such as uncomfortably large neutrino masses (in the context of the measured Planck temperature anisotropies and other data). We also find modest evidence (at 95 per cent confidence) for a mass dependence of the calibration ratio and discuss its potential origin in light of systematic uncertainties in the temperature calibration of the X-ray measurements used to calibrate the Planck cluster masses. Our results exemplify the critical role that robust absolute mass calibration plays in cluster cosmology, and the invaluable role of accurate weak-lensing mass measurements in this regard.
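The quoted $\sim 1.6\sigma$ offset can be reproduced with simple arithmetic, assuming the default mass bias of 0.8 is treated as exact and the significance is the difference divided by the quoted uncertainty on each ratio. This is an assumption made here for illustration; the abstract does not state how the figure was computed.

```latex
\[
\frac{0.8 - 0.688}{0.072} \approx 1.56,
\qquad
\frac{0.8 - 0.698}{0.062} \approx 1.65
\]
```

Both values round to roughly $1.6$, matching the figure given for the 22-cluster and 38-cluster samples.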
We present an algorithm for inferring the dynamical mass of galaxy clusters directly from their respective phase-space distributions, i.e. the observed line-of-sight velocities and projected distances of galaxies from the cluster centre. Our method employs normalizing flows, a deep neural network capable of learning arbitrary high-dimensional probability distributions, and inherently accounts, to an adequate extent, for the presence of interloper galaxies which are not bound to a given cluster, the primary contaminant of dynamical mass measurements. We validate and showcase the performance of our neural flow approach to robustly infer the dynamical mass of clusters from a realistic mock cluster catalogue. A key aspect of our novel algorithm is that it yields the probability density function of the mass of a particular cluster, thereby providing a principled way of quantifying uncertainties, in contrast to conventional machine learning approaches. The neural network mass predictions, when applied to a contaminated catalogue with interlopers, have a mean overall logarithmic residual scatter of 0.028 dex, with a log-normal scatter of 0.126 dex, which goes down to 0.089 dex for clusters in the intermediate to high mass range. This is an improvement by nearly a factor of four relative to the classical cluster mass scaling relation with the velocity dispersion, and outperforms recently proposed machine learning approaches. We also apply our neural flow mass estimator to a compilation of galaxy observations of some well-studied clusters with robust dynamical mass estimates, further substantiating the efficacy of our algorithm.
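For context, normalizing flows in general rest on the change-of-variables identity for an invertible, differentiable map $f$ that sends a data vector $x$ to a latent vector $z = f(x)$ with a simple base density $p_Z$. The symbols $f$, $p_X$, and $p_Z$ are introduced here for illustration. The abstract does not specify the flow architecture or how the mass density is conditioned on the phase-space data.

```latex
\[
p_X(x) = p_Z\bigl(f(x)\bigr)\,
\left| \det \frac{\partial f(x)}{\partial x} \right|
\]
```

Because the density is available in closed form, a flow can return a full probability density function for a quantity such as cluster mass instead of a single point estimate.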