Cosmological cross-correlations and nearest neighbor distributions

Added by Arka Banerjee

Publication date 2021

Field Physics

Language English

Authors Arka Banerjee - Tom Abel

Cosmology and Nongalactic Astrophysics

Cross-correlations between datasets are used in many different contexts in cosmological analyses. Recently, $k$-Nearest Neighbor Cumulative Distribution Functions ($k{\rm NN}$-${\rm CDF}$) were shown to be sensitive probes of cosmological (auto) clustering. In this paper, we extend the framework of nearest neighbor measurements to describe joint distributions of, and correlations between, two datasets. We describe the measurement of joint $k{\rm NN}$-${\rm CDF}$s, and show that these measurements are sensitive to all possible connected $N$-point functions that can be defined in terms of the two datasets. We describe how the cross-correlations can be isolated by combining measurements of the joint $k{\rm NN}$-${\rm CDF}$s and those measured from individual datasets. We demonstrate the application of these measurements in the context of Gaussian density fields, as well as for fully nonlinear cosmological datasets. Using a Fisher analysis, we show that the halo-matter cross-correlations, as measured through nearest neighbor measurements, are more sensitive to the underlying cosmological parameters than traditional two-point cross-correlation measurements over the same range of scales. Finally, we demonstrate how the nearest neighbor cross-correlations can robustly detect cross-correlations between sparse samples -- the same regime where the two-point cross-correlation measurements are dominated by noise.
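As a rough illustration of how a joint measurement might be combined with the individual ones, the sketch below takes the joint $k{\rm NN}$-${\rm CDF}$ at radius $r$ to be the fraction of query points that have their $k$-th neighbor in both datasets within $r$, and divides it by the product of the two individual CDFs. For statistically independent datasets the joint probability factorizes, so this ratio should sit near 1, while spatially correlated datasets push it above 1. The estimator form, the function names (`kth_nn_dist`, `excess_cross_correlation`) and the toy data are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def kth_nn_dist(data, queries, k, boxsize):
    """Distance from each query point to its k-th nearest data point (periodic box).

    Brute force for clarity; a spatial tree would replace this for large samples.
    """
    d2 = np.zeros((len(queries), len(data)))
    for axis in range(data.shape[1]):
        diff = np.abs(queries[:, None, axis] - data[None, :, axis])
        diff = np.minimum(diff, boxsize - diff)  # periodic wrap-around
        d2 += diff ** 2
    return np.sqrt(np.partition(d2, k - 1, axis=1)[:, k - 1])

def excess_cross_correlation(set_a, set_b, queries, radii, boxsize, k_a=1, k_b=1):
    """Joint kNN-CDF divided by the product of the two individual kNN-CDFs.

    Hypothetical helper: the ratio is an illustrative choice of combination.
    """
    d_a = kth_nn_dist(set_a, queries, k_a, boxsize)
    d_b = kth_nn_dist(set_b, queries, k_b, boxsize)
    cdf_a = (d_a[:, None] <= radii).mean(axis=0)
    cdf_b = (d_b[:, None] <= radii).mean(axis=0)
    # Both neighbors lie within r exactly when the larger distance does.
    joint = (np.maximum(d_a, d_b)[:, None] <= radii).mean(axis=0)
    return joint / (cdf_a * cdf_b)

# Toy data (assumed): one sample, a scattered copy of it, and an unrelated sample.
rng = np.random.default_rng(1)
box = 1.0
queries = rng.uniform(0, box, size=(8000, 3))
set_a = rng.uniform(0, box, size=(300, 3))
set_b_corr = (set_a + rng.normal(0, 0.01, size=set_a.shape)) % box
set_b_ind = rng.uniform(0, box, size=(300, 3))
radii = np.array([0.06, 0.07, 0.08])

psi_corr = excess_cross_correlation(set_a, set_b_corr, queries, radii, box)
psi_ind = excess_cross_correlation(set_a, set_b_ind, queries, radii, box)
print("correlated pair: ", psi_corr)
print("independent pair:", psi_ind)
```

Both sets here hold only 300 points, which is loosely in the spirit of the sparse-sample regime mentioned in the abstract, though the toy setup makes no attempt to reproduce the paper's analysis.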

Nearest Neighbor distributions: new statistical measures for cosmological clustering

Arka Banerjee, Tom Abel 2020

The use of summary statistics beyond the two-point correlation function to analyze the non-Gaussian clustering on small scales is an active field of research in cosmology. In this paper, we explore a set of new summary statistics -- the $k$-Nearest Neighbor Cumulative Distribution Functions ($k{\rm NN}$-${\rm CDF}$). This is the empirical cumulative distribution function of distances from a set of volume-filling, Poisson distributed random points to the $k$-nearest data points, and is sensitive to all connected $N$-point correlations in the data. The $k{\rm NN}$-${\rm CDF}$ can be used to measure counts-in-cells, void probability distributions and higher $N$-point correlation functions, all using the same formalism exploiting fast searches with spatial tree data structures. We demonstrate how it can be computed efficiently from various data sets, both discrete points and the generalization to continuous fields. We use data from a large suite of $N$-body simulations to explore the sensitivity of this new statistic to various cosmological parameters, compared to the two-point correlation function, while using the same range of scales. We demonstrate that the use of $k{\rm NN}$-${\rm CDF}$ improves the constraints on the cosmological parameters by more than a factor of $2$ when applied to the clustering of dark matter in the range of scales between $10h^{-1}{\rm Mpc}$ and $40h^{-1}{\rm Mpc}$. We also show that the relative improvement is even greater when applied on the same scales to the clustering of halos in the simulations at a fixed number density, both in real space and in redshift space. Since the $k{\rm NN}$-${\rm CDF}$ are sensitive to all higher order connected correlation functions in the data, the gains over traditional two-point analyses are expected to grow as progressively smaller scales are included in the analysis of cosmological data.
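The measurement described above can be sketched in a few lines: scatter random query points through the volume, find the distance from each to its $k$-th nearest data point with a spatial tree, and take the empirical CDF of those distances. The version below is a minimal sketch that assumes a periodic unit box and uses SciPy's `cKDTree`. The function name `knn_cdf` and the uniform toy data are illustrative assumptions, not the authors' code.

```python
import numpy as np
from scipy.spatial import cKDTree

def knn_cdf(data, queries, k, radii, boxsize=None):
    """Empirical CDF of the distance from each query point to its k-th nearest data point."""
    tree = cKDTree(data, boxsize=boxsize)      # boxsize enables periodic wrapping
    dists, _ = tree.query(queries, k=k)        # shape (n,) for k=1, (n, k) otherwise
    kth = np.asarray(dists).reshape(len(queries), -1)[:, -1]
    kth_sorted = np.sort(kth)
    # Fraction of query points whose k-th neighbor lies within each radius.
    return np.searchsorted(kth_sorted, radii, side="right") / len(kth_sorted)

# Toy data (assumed): unclustered points in a periodic unit box.
rng = np.random.default_rng(0)
box = 1.0
data = rng.uniform(0, box, size=(2000, 3))
queries = rng.uniform(0, box, size=(20000, 3))   # volume-filling random points
radii = np.linspace(0.01, 0.15, 8)

cdf1 = knn_cdf(data, queries, 1, radii, boxsize=box)
cdf2 = knn_cdf(data, queries, 2, radii, boxsize=box)
print(cdf1)
print(cdf2)
```

For unclustered data of mean number density $\bar{n}$, the $k=1$ curve should follow $1 - \exp(-\bar{n}\,\tfrac{4}{3}\pi r^3)$, since $1 - {\rm CDF}_{1{\rm NN}}(r)$ is the probability that a sphere of radius $r$ is empty. Departures from that curve in real data reflect clustering.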

Cosmology and Nongalactic Astrophysics

Lehman H. Garrison, Tom Abel, Daniel J. Eisenstein 2021

We use the $k$-nearest neighbor probability distribution function ($k$NN-PDF, Banerjee & Abel 2021) to assess convergence in a scale-free $N$-body simulation. Compared to our previous two-point analysis, the $k$NN-PDF allows us to quantify our results in the language of halos and numbers of particles, while also incorporating non-Gaussian information. We find good convergence for 32 particles and greater at densities typical of halos, while 16 particles and fewer appear unconverged. Halving the softening length extends convergence to higher densities, but not to fewer particles. Our analysis is less sensitive to voids, but we analyze a limited range of underdensities and find evidence for convergence at 16 particles and greater even in sparse voids.

Cosmology and Nongalactic Astrophysics

Modeling Nearest Neighbor distributions of biased tracers using Hybrid Effective Field Theory

Arka Banerjee, Nickolas Kokron, Tom Abel 2021

We investigate the application of Hybrid Effective Field Theory (HEFT) -- which combines a Lagrangian bias expansion with subsequent particle dynamics from $N$-body simulations -- to the modeling of $k$-Nearest Neighbor Cumulative Distribution Functions ($k{\rm NN}$-${\rm CDF}$s) of biased tracers of the cosmological matter field. The $k{\rm NN}$-${\rm CDF}$s are sensitive to all higher order connected $N$-point functions in the data, but are cheap to compute. We develop the formalism to predict the $k{\rm NN}$-${\rm CDF}$s of discrete tracers of a continuous field from the statistics of the continuous field itself. Using this formalism, we demonstrate how $k{\rm NN}$-${\rm CDF}$ statistics of a set of biased tracers, such as halos or galaxies, of the cosmological matter field can be modeled given a set of low-redshift HEFT component fields and bias parameter values. These are the same ingredients needed to predict the two-point clustering. For a specific sample of halos, we show that both the two-point clustering \textit{and} the $k{\rm NN}$-${\rm CDF}$s can be well-fit on quasi-linear scales ($\gtrsim 20 h^{-1}{\rm Mpc}$) by the second-order HEFT formalism with the \textit{same values} of the bias parameters, implying that joint modeling of the two is possible. Finally, using a Fisher matrix analysis, we show that including $k{\rm NN}$-${\rm CDF}$ measurements over the range of allowed scales in the HEFT framework can improve the constraints on $\sigma_8$ by roughly a factor of $3$, compared to the case where only two-point measurements are considered. Combining the statistical power of $k{\rm NN}$ measurements with the modeling power of HEFT, therefore, represents an exciting prospect for extracting greater information from small-scale cosmological clustering.

Cosmology and Nongalactic Astrophysics

Recovering Redshift Distributions with Cross-Correlations: Pushing The Boundaries

Samuel Schmidt 2013

Determining accurate redshift distributions for very large samples of objects has become increasingly important in cosmology. We investigate the impact of extending cross-correlation based redshift distribution recovery methods to include small scale clustering information. The major concern in such work is the ability to disentangle the amplitude of the underlying redshift distribution from the influence of evolving galaxy bias. Using multiple simulations covering a variety of galaxy bias evolution scenarios, we demonstrate reliable redshift recoveries using linear clustering assumptions well into the non-linear regime for redshift distributions of narrow redshift width. Including information from intermediate physical scales balances the increased information available from clustering and the residual bias incurred from relaxing of linear constraints. We discuss how breaking a broad sample into tomographic bins can improve estimates of the redshift distribution, and present a simple bias removal technique using clustering information from the spectroscopic sample alone.

Cosmology and Nongalactic Astrophysics

Generalized Nearest Neighbor Decoding

Yizhu Wang, Wenyi Zhang 2020

It is well known that for linear Gaussian channels, a nearest neighbor decoding rule, which seeks the minimum Euclidean distance between a codeword and the received channel output vector, is the maximum likelihood solution and hence capacity-achieving. Nearest neighbor decoding remains a convenient and yet mismatched solution for general channels, and the key message of this paper is that the performance of the nearest neighbor decoding can be improved by generalizing its decoding metric to incorporate channel state dependent output processing and codeword scaling. Using generalized mutual information, which is a lower bound to the mismatched capacity under independent and identically distributed codebook ensemble, as the performance measure, this paper establishes the optimal generalized nearest neighbor decoding rule, under Gaussian channel input. Several suboptimal but reduced-complexity generalized nearest neighbor decoding rules are also derived and compared with existing solutions. The results are illustrated through several case studies for channels with nonlinear effects, and fading channels with receiver channel state information or with pilot-assisted training.
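For orientation, the classical rule mentioned in the first sentence (pick the codeword at minimum Euclidean distance from the received vector) can be sketched as below. This is only the plain nearest neighbor rule, not the generalized metric with output processing and codeword scaling that the paper proposes. The random Gaussian codebook, the noise level and the function name `nearest_neighbor_decode` are assumptions chosen for illustration.

```python
import numpy as np

def nearest_neighbor_decode(codebook, received):
    """Return the index of the codeword closest to `received` in Euclidean distance."""
    sq_dist = ((codebook - received) ** 2).sum(axis=1)
    return int(np.argmin(sq_dist))

# Toy setup (assumed): 16 Gaussian codewords of length 8 over an additive Gaussian noise channel.
rng = np.random.default_rng(2)
codebook = rng.normal(size=(16, 8))
sent = 5
received = codebook[sent] + rng.normal(scale=0.1, size=8)

decoded = nearest_neighbor_decode(codebook, received)
print(decoded == sent)
```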

Information Theory