Higher-order clustering statistics, like the galaxy bispectrum, can add complementary cosmological information to what is accessible with two-point statistics, like the power spectrum. While the standard way of measuring the bispectrum involves estimating a bispectrum value in a large number of Fourier triangle bins, the compressed modal bispectrum approximates the bispectrum as a linear combination of basis functions and estimates the expansion coefficients on the chosen basis. In this work, we compare the two estimators by using parallel pipelines to analyze the real-space halo bispectrum measured in a suite of $N$-body simulations corresponding to a total volume of $\sim 1{,}000\,h^{-3}\,{\rm Gpc}^3$, with covariance matrices estimated from 10,000 mock halo catalogs. We find that the modal bispectrum yields constraints that are consistent and competitive with the standard bispectrum analysis: for the halo bias and shot noise parameters within the tree-level halo bispectrum model up to $k_{\rm max} \approx 0.06\,(0.10)\,h\,{\rm Mpc}^{-1}$, only 6 (10) modal expansion coefficients are necessary to obtain constraints equivalent to the standard bispectrum estimator using $\sim$ 20 to 1,600 triangle bins, depending on the bin width. For this work, we have implemented a modal estimator pipeline with Markov Chain Monte Carlo sampling for the first time, and we discuss in detail how the parameter posteriors and modal expansion are robust or sensitive to several user settings within the modal bispectrum pipeline. The combination of the highly efficient compression that is achieved and the large number of mock catalogs available allows us to quantify how our modal bispectrum constraints depend on the number of mocks used to estimate covariance matrices and on the functional form of the likelihood.
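Schematically, the modal decomposition referred to above expands the (suitably weighted) bispectrum on a small set of basis functions $Q_n$, so the data vector becomes a handful of coefficients $\beta_n$ rather than hundreds of triangle bins; in our illustrative notation (the precise weighting and basis are defined in the paper),

$$ B(k_1,k_2,k_3) \;\approx\; \sum_{n=0}^{n_{\rm max}-1} \beta_n\, Q_n(k_1,k_2,k_3), $$

with $n_{\rm max}=6$ or $10$ sufficing in the analysis summarized above.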
Over the next decade, improvements in cosmological parameter constraints will be driven by surveys of large-scale structure. The inherent non-linearity of large-scale structure suggests that significant information will be embedded in higher correlations beyond the two-point function. Extracting this information is extremely challenging: it requires accurate theoretical modelling and significant computational resources to estimate the covariance matrix describing correlations between different Fourier configurations. We investigate whether it is possible to reduce the covariance matrix without significant loss of information by using a proxy that aggregates the bispectrum over a subset of Fourier configurations. Specifically, we study the constraints on $\Lambda$CDM parameters from combining the power spectrum with (a) the modal bispectrum decomposition, (b) the line correlation function and (c) the integrated bispectrum. We forecast the error bars achievable on $\Lambda$CDM parameters using these proxies in a future galaxy survey and compare them to those obtained from measurements of the Fourier bispectrum, including simple estimates of their degradation in the presence of shot noise. Our results demonstrate that the modal bispectrum performs as well as the Fourier bispectrum, even with considerably fewer modes than Fourier configurations. The line correlation function performs well but does not match the modal bispectrum. The integrated bispectrum is comparatively insensitive to changes in the background cosmology. We find that adding bispectrum data can improve constraints on bias parameters and the normalization $\sigma_8$ by up to a factor of 5 compared to power spectrum measurements alone. For other parameters, improvements of up to $\sim$ 20% are possible. Finally, we use a range of theoretical models to explore how the sophistication required for realistic predictions varies with each proxy. (abridged)
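As a rough illustration of the forecasting step above: with a Gaussian likelihood, marginalized error bars follow from a Fisher matrix built from the parameter derivatives of the model data vector and its covariance. The sketch below is ours (the names, shapes and the Gaussian assumption are not the paper's pipeline, which also handles shot noise and the combination of the power spectrum with each proxy):

    import numpy as np

    def fisher_errors(dmodel_dtheta, cov):
        """Marginalized 1-sigma forecasts from a Gaussian Fisher matrix.

        dmodel_dtheta : (n_params, n_data) derivatives of the model data
                        vector (e.g. P(k) plus a bispectrum proxy) with
                        respect to each parameter, at the fiducial point.
        cov           : (n_data, n_data) covariance of the data vector.
        """
        icov = np.linalg.inv(cov)
        fisher = dmodel_dtheta @ icov @ dmodel_dtheta.T
        return np.sqrt(np.diag(np.linalg.inv(fisher)))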
We apply two compression methods to the galaxy power spectrum monopole/quadrupole and bispectrum monopole measurements from the BOSS DR12 CMASS sample. Both methods reduce the dimension of the original data-vector to the number of cosmological parameters considered, using the Karhunen-Loève algorithm with an analytic covariance model. In the first case, we infer the posterior through MCMC sampling from the likelihood of the compressed data-vector (MC-KL). The second, faster option works by first Gaussianizing and then orthogonalizing the parameter space before the compression; in this option (G-PCA) we only need to run a low-resolution preliminary MCMC sampling for the Gaussianization to compute our posterior. Both compression methods accurately reproduce the posterior distributions obtained by standard MCMC sampling on the CMASS dataset for a $k$-space range of $0.03-0.12\,h/\mathrm{Mpc}$. The compression enables us to increase the number of bispectrum measurements by a factor of $\sim 23$ over the standard binning (from 116 to 2734 triangles used), which is otherwise limited by the number of mock catalogues available. This reduces the $68\%$ credible intervals for the parameters $\left(b_1,b_2,f,\sigma_8\right)$ by $\left(-24.8\%,-52.8\%,-26.4\%,-21\%\right)$, respectively. The best-fit values we obtain are $(b_1=2.31\pm0.17$, $b_2=0.77\pm0.19$, $f(z_{\mathrm{CMASS}})=0.67\pm0.06$, $\sigma_8(z_{\mathrm{CMASS}})=0.51\pm0.03)$. Using these methods for future redshift surveys like DESI, Euclid and PFS will drastically reduce the number of simulations needed to compute accurate covariance matrices and will facilitate tighter constraints on cosmological parameters.
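The Karhunen-Loève step described above can be illustrated with a minimal MOPED-style projection: one compressed number per parameter, built from the derivatives of the model mean and the analytic covariance. This is a sketch under those assumptions, not the authors' implementation:

    import numpy as np

    def kl_compress(vector, mean, dmean_dtheta, cov):
        """Compress an n_data vector down to n_params numbers.

        vector        : (n_data,) data or theory vector to be compressed
        mean          : (n_data,) fiducial model data vector
        dmean_dtheta  : (n_params, n_data) derivatives of the model mean
        cov           : (n_data, n_data) analytic covariance model
        """
        weights = dmean_dtheta @ np.linalg.inv(cov)  # one weight vector per parameter
        return weights @ (vector - mean)             # compressed vector, length n_params

In the full Karhunen-Loève (MOPED) construction the weight vectors are additionally orthogonalized, so that for a Gaussian likelihood with parameter-independent covariance the compression preserves the Fisher information; the same projection is applied to the theory vector at every point visited by the MCMC.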
Clustering of large-scale structure provides significant cosmological information through the power spectrum of density perturbations. Additional information can be gained from higher-order statistics like the bispectrum, especially to break the degeneracy between the linear halo bias $b_1$ and the amplitude of fluctuations $\sigma_8$. We propose new simple, computationally inexpensive bispectrum statistics that are near optimal for specific applications like bias determination. Corresponding to the Legendre decomposition of nonlinear halo bias and gravitational coupling at second order, these statistics are given by the cross-spectra of the density with three quadratic fields: the squared density, a tidal term, and a shift term. For halos and galaxies the first two have associated nonlinear bias terms $b_2$ and $b_{s^2}$, respectively, while the shift term has none in the absence of velocity bias (valid in the $k \rightarrow 0$ limit). Thus the linear bias $b_1$ is best determined by the shift cross-spectrum, while the squared density and tidal cross-spectra mostly tighten constraints on $b_2$ and $b_{s^2}$ once $b_1$ is known. Since the form of the cross-spectra is derived from optimal maximum-likelihood estimation, they contain the full bispectrum information on bias parameters. Perturbative analytical predictions for their expectation values and covariances agree with simulations on large scales, $k \lesssim 0.09\,h/\mathrm{Mpc}$ at $z=0.55$ with Gaussian $R=20\,h^{-1}\,\mathrm{Mpc}$ smoothing, for matter-matter-matter and matter-matter-halo combinations. For halo-halo-halo cross-spectra the model also needs to include corrections to the Poisson stochasticity.
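The three quadratic fields entering these cross-spectra can be generated from a single density grid with FFTs; the sketch below builds them (taking the shift term as $\Psi\cdot\nabla\delta$ with $\Psi(\mathbf{k}) = i\,\mathbf{k}\,\delta(\mathbf{k})/k^2$), leaving the smoothing and the actual cross-spectrum measurement against $\delta$ to any standard power spectrum estimator. Conventions and normalizations here are illustrative, not the paper's:

    import numpy as np

    def quadratic_fields(delta, boxsize):
        """Return the real-space grids (delta^2, s^2, shift) built from delta."""
        n = delta.shape[0]
        dk = np.fft.rfftn(delta)
        kf = 2.0 * np.pi / boxsize
        kx = np.fft.fftfreq(n, d=1.0 / n) * kf
        kz = np.fft.rfftfreq(n, d=1.0 / n) * kf
        kvec = np.meshgrid(kx, kx, kz, indexing='ij')
        k2 = kvec[0]**2 + kvec[1]**2 + kvec[2]**2
        k2[0, 0, 0] = 1.0  # avoid 0/0; the k = 0 mode is irrelevant here

        # squared density delta^2(x)
        d2 = delta**2

        # tidal term s^2 = s_ij s_ij with s_ij(k) = (k_i k_j / k^2 - delta_ij / 3) delta(k)
        s2 = np.zeros_like(delta)
        for i in range(3):
            for j in range(3):
                sij = np.fft.irfftn((kvec[i] * kvec[j] / k2 - (i == j) / 3.0) * dk,
                                    s=delta.shape)
                s2 += sij**2

        # shift term Psi . grad(delta) with Psi(k) = i k delta(k) / k^2
        shift = np.zeros_like(delta)
        for i in range(3):
            psi_i = np.fft.irfftn(1j * kvec[i] / k2 * dk, s=delta.shape)
            grad_i = np.fft.irfftn(1j * kvec[i] * dk, s=delta.shape)
            shift += psi_i * grad_i

        return d2, s2, shift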
When analyzing the galaxy bispectrum measured from spectroscopic surveys, it is imperative to account for the effects of non-uniform survey geometry. Conventionally, this is done by convolving the theory model with the window function; however, the computational expense of this prohibits full exploration of the bispectrum likelihood. In this work, we provide a new class of estimators for the unwindowed bispectrum: a quantity that can be straightforwardly compared to theory. This builds upon the work of Philcox (2021) for the power spectrum, and comprises two parts (both obtained from an Edgeworth expansion): a cubic estimator applied to the data, and a Fisher matrix, which deconvolves the bispectrum components. In the limit of weak non-Gaussianity, the estimator is minimum-variance; furthermore, we give an alternate form based on FKP weights that is close-to-optimal and easy to compute. As a demonstration, we measure the binned bispectrum monopole of a suite of simulations both using conventional estimators and our unwindowed equivalents. Computation times are comparable, except that the unwindowed approach requires a Fisher matrix, computable in an additional $\mathcal{O}(100)$ CPU-hours. Our estimator may be straightforwardly extended to measure redshift-space distortions and the components of the bispectrum in arbitrary separable bases. The techniques of this work will allow the bispectrum to be straightforwardly included in the cosmological analysis of current and upcoming survey data.
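In compact form, the estimator described above is the product of an inverse Fisher matrix and a cubic statistic of the data; schematically (our notation, with the precise weighting and Edgeworth-expansion derivation given in the paper),

$$ \hat{b}_\alpha = \sum_\beta \left(F^{-1}\right)_{\alpha\beta}\, \hat{q}_\beta, $$

where $\hat{q}_\beta$ is cubic in the (weighted) data for each bispectrum bin and $F_{\alpha\beta}$ deconvolves the survey window, so that $\langle\hat{b}_\alpha\rangle$ is the unwindowed binned bispectrum.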
Upcoming galaxy redshift surveys promise to significantly improve current limits on primordial non-Gaussianity (PNG) through measurements of 2- and 3-point correlation functions in Fourier space. However, realizing the full potential of this dataset is contingent upon having both accurate theoretical models and optimized analysis methods. Focusing on the local model of PNG, parameterized by $f_{\rm NL}$, we perform a Markov Chain Monte Carlo analysis to confront perturbation theory predictions of the halo power spectrum and bispectrum in real space against a suite of N-body simulations. We model the halo bispectrum at tree-level, including all contributions linear and quadratic in $f_{\rm NL}$, and the halo power spectrum at 1-loop, including tree-level terms up to quadratic order in $f_{\rm NL}$ and all loops induced by local PNG linear in $f_{\rm NL}$. Keeping the cosmological parameters fixed, we examine the effect of informative priors on the linear non-Gaussian bias parameter on the statistical inference of $f_{\rm NL}$. A conservative analysis of the combined power spectrum and bispectrum, in which only loose priors are imposed and all parameters are marginalized over, can improve the constraint on $f_{\rm NL}$ by more than a factor of 5 relative to the power spectrum-only measurement. Imposing a strong prior on $b_\phi$, or assuming bias relations for both $b_\phi$ and $b_{\phi\delta}$ (motivated by a universal mass function assumption), improves the constraints further by a factor of a few. In this case, however, we find a significant systematic shift in the inferred value of $f_{\rm NL}$ if the same range of wavenumbers is used. Likewise, a Poisson noise assumption can lead to significant systematics, and it is thus essential to leave all the stochastic amplitudes free.
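The role of the priors discussed above can be made concrete with a minimal log-posterior of the kind such an MCMC samples: a Gaussian likelihood for the joint power spectrum and bispectrum data vector plus an optional Gaussian prior on $b_\phi$. Everything below, including the parameter ordering, is a schematic stand-in for the paper's actual model and sampler:

    import numpy as np

    def log_posterior(theta, data, model_fn, icov, bphi_prior=None):
        """theta: e.g. (fNL, b1, b2, bphi, ..., free stochastic amplitudes).

        data       : concatenated P(k) and B(k1, k2, k3) measurements
        model_fn   : callable mapping theta to the model data vector
        icov       : inverse covariance of the data vector
        bphi_prior : optional (mean, sigma) Gaussian prior on b_phi (theta[3] here)
        """
        resid = data - model_fn(theta)
        logp = -0.5 * resid @ icov @ resid
        if bphi_prior is not None:
            mean, sigma = bphi_prior
            logp += -0.5 * ((theta[3] - mean) / sigma) ** 2
        return logp

Leaving the stochastic amplitudes free, as advocated above, simply means keeping them as sampled entries of theta with wide priors rather than fixing them to their Poisson values.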