Constraining Cosmology with Big Data Statistics of Cosmological Graphs


Abstract in English

Using large-scale graph analytic tools implemented in the modern Big Data platform Apache Spark, we investigate the topological structure of gravitational clustering in five different universes produced by cosmological $N$-body simulations with varying parameters: (1) a WMAP 5-year compatible $\Lambda$CDM cosmology, (2) two variants with different dark energy equations of state, and (3) two variants with different cosmic matter densities. For the Big Data calculations, we use a custom-built stand-alone Spark/Hadoop cluster at the Korea Institute for Advanced Study (KIAS) and Dataproc Compute Engine in Google Cloud Platform (GCP), with sample sizes ranging from 7 million to 200 million. We find that, among the many possible graph-topological measures, three simple ones can effectively discriminate among the five model universes: (1) the average number of neighbors (the so-called average vertex degree) $\alpha$, (2) the closed-to-connected triple fraction (the so-called transitivity) $\tau_\Delta$, and (3) the cumulative number density $n_{s\ge5}$ of subcomponents with connected component size $s \ge 5$. Since these graph-topological measures are directly related to the usual $n$-point correlation functions of the cosmic density field, graph-topological statistics powered by Big Data computational infrastructure open a new, intuitive, and computationally efficient window into the dark Universe.
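
To make the three measures concrete, the following is a minimal single-machine sketch in plain Python, not the Spark implementation used in this work. It assumes the graph is already given as an undirected edge list; how the cosmological graph is built from simulation particles is not specified here. The function names, the toy graph, and the `volume` value are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map, ignoring self-loops."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    return dict(adj)


def average_degree(adj, n_vertices):
    """Average vertex degree alpha: mean number of neighbors per vertex."""
    return sum(len(nbrs) for nbrs in adj.values()) / n_vertices


def transitivity(adj):
    """Transitivity tau_Delta: closed triples divided by connected triples."""
    closed = 0      # triples whose two end vertices are also linked
    connected = 0   # all triples centred on some vertex
    for nbrs in adj.values():
        k = len(nbrs)
        connected += k * (k - 1) // 2
        for a, b in combinations(nbrs, 2):
            if b in adj[a]:
                closed += 1
    return closed / connected if connected else 0.0


def component_sizes(adj, vertices):
    """Sizes of all connected components, found by depth-first search."""
    seen = set()
    sizes = []
    for start in vertices:
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        size = 0
        while stack:
            v = stack.pop()
            size += 1
            for w in adj.get(v, ()):
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        sizes.append(size)
    return sizes


def cumulative_number_density(sizes, volume, s_min=5):
    """Number of components with size >= s_min per unit volume."""
    return sum(1 for s in sizes if s >= s_min) / volume


# Toy graph (hypothetical): a triangle with a two-link tail, one pair,
# and one isolated vertex.
vertices = range(8)
edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (5, 6)]
volume = 2.0  # hypothetical volume in arbitrary units

adj = build_adjacency(edges)
sizes = component_sizes(adj, vertices)
alpha = average_degree(adj, len(vertices))
tau = transitivity(adj)
n_s_ge_5 = cumulative_number_density(sizes, volume)
print(alpha, tau, sorted(sizes, reverse=True), n_s_ge_5)
```

In this toy graph every triangle contributes three closed triples, one per centre vertex, so the ratio matches the closed-to-connected triple fraction described above. For samples of millions of vertices a distributed graph library would be needed; the sketch only illustrates the definitions.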
