We describe the first major public data release from cosmological simulations carried out with Argonne's HACC code. This initial release covers a range of datasets from large gravity-only simulations. The data products include halo information for multiple redshifts, down-sampled particles, and lightcone outputs. We provide data from two very large $\Lambda$CDM simulations as well as beyond-$\Lambda$CDM simulations spanning eleven $w_0$-$w_a$ cosmologies. Our release platform uses Petrel, a research data service located at the Argonne Leadership Computing Facility. Petrel offers fast data transfer mechanisms and authentication via Globus, enabling simple and efficient access to stored datasets. Easy browsing of the available data products is provided via a web portal that allows the user to navigate simulation products efficiently. The data hub will be extended by adding more types of data products and by enabling computational capabilities to allow direct interactions with simulation results.
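To make the Globus-based access pattern concrete, the Python sketch below (assuming the globus-sdk package) submits an asynchronous transfer of a single file. The client ID, endpoint UUIDs, and file paths are placeholders rather than the actual Petrel identifiers, so it should be read as an illustration of the workflow, not a verbatim recipe.

import globus_sdk

CLIENT_ID = "<native-app-client-id>"              # placeholder
SOURCE_ENDPOINT = "<data-service-endpoint-uuid>"  # placeholder (e.g. the hosting share)
DEST_ENDPOINT = "<your-endpoint-uuid>"            # placeholder (your own Globus endpoint)

# Interactive native-app login to obtain a transfer access token.
auth_client = globus_sdk.NativeAppAuthClient(CLIENT_ID)
auth_client.oauth2_start_flow()
print("Please log in at:", auth_client.oauth2_get_authorize_url())
tokens = auth_client.oauth2_exchange_code_for_tokens(input("Authorization code: "))
transfer_token = tokens.by_resource_server["transfer.api.globus.org"]["access_token"]

# Submit an asynchronous transfer of one (hypothetical) halo-catalog file.
tc = globus_sdk.TransferClient(
    authorizer=globus_sdk.AccessTokenAuthorizer(transfer_token))
task = globus_sdk.TransferData(tc, SOURCE_ENDPOINT, DEST_ENDPOINT,
                               label="halo catalog download")
task.add_item("/path/to/halo_catalog.dat", "/local/path/halo_catalog.dat")  # placeholder paths
print("Submitted task:", tc.submit_transfer(task)["task_id"])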
The Dark Sky Simulations are an ongoing series of cosmological N-body simulations designed to provide a quantitative and accessible model of the evolution of the large-scale Universe. Such models are essential for many aspects of the study of dark matter and dark energy, since we lack a sufficiently accurate analytic model of non-linear gravitational clustering. In July 2014, we made available to the general community our early data release, consisting of over 55 terabytes of simulation data products, including our largest simulation to date, which used $1.07 \times 10^{12}~(10240^3)$ particles in a volume $8\,h^{-1}\,\mathrm{Gpc}$ across. Our simulations were performed with 2HOT, a purely tree-based adaptive N-body method, running on 200,000 processors of the Titan supercomputer, with data analysis enabled by yt. We provide an overview of the derived halo catalogs, mass function, power spectra and light cone data. We show self-consistency in the mass function and mass power spectrum at the 1% level over a range of more than 1000 in particle mass. We also present a novel method to distribute and access very large datasets, based on an abstraction of the World Wide Web (WWW) as a file system, remote memory-mapped file access semantics, and a space-filling curve index. This method has been implemented for our data release, and provides a means to not only query stored results such as halo catalogs, but also to design and deploy new analysis techniques on large distributed datasets.
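As a rough illustration of the space-filling curve indexing idea described above, the following Python sketch assigns Morton (Z-order) keys to particle positions so that spatially nearby particles land in nearby byte ranges of a file. It is a generic sketch of the technique, not the Dark Sky implementation; the box size and bit depth are arbitrary choices.

import numpy as np

def _spread_bits(x):
    """Interleave zeros between the low 21 bits of x (for a 63-bit 3D key)."""
    x = x.astype(np.uint64) & np.uint64(0x1FFFFF)
    x = (x | (x << np.uint64(32))) & np.uint64(0x1F00000000FFFF)
    x = (x | (x << np.uint64(16))) & np.uint64(0x1F0000FF0000FF)
    x = (x | (x << np.uint64(8)))  & np.uint64(0x100F00F00F00F00F)
    x = (x | (x << np.uint64(4)))  & np.uint64(0x10C30C30C30C30C3)
    x = (x | (x << np.uint64(2)))  & np.uint64(0x1249249249249249)
    return x

def morton_key(pos, box_size, bits=21):
    """Map float positions in [0, box_size)^3 to 63-bit Morton keys."""
    cells = np.floor(pos / box_size * (1 << bits)).astype(np.uint64)
    return (_spread_bits(cells[:, 0])
            | (_spread_bits(cells[:, 1]) << np.uint64(1))
            | (_spread_bits(cells[:, 2]) << np.uint64(2)))

# Sorting particles by key groups neighbours together, so a remote range
# request (e.g. an HTTP byte-range read) touches only a few contiguous chunks.
pos = np.random.rand(1000, 3) * 8000.0           # toy positions, box = 8 Gpc/h
order = np.argsort(morton_key(pos, 8000.0))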
We present the online MultiDark Database -- a Virtual Observatory-oriented, relational database for hosting various cosmological simulations. The data is accessible via an SQL (Structured Query Language) query interface, which also allows users to directly pose scientific questions, as shown in a number of examples in this paper. Further examples for the usage of the database are given in its extensive online documentation (www.multidark.org). The database is based on the same technology as the Millennium Database, a fact that will greatly facilitate the usage of both suites of cosmological simulations. The first release of the MultiDark Database hosts two 8.6 billion particle cosmological N-body simulations: the Bolshoi (250/h Mpc simulation box, 1/h kpc resolution) and MultiDark Run1 simulation (MDR1, or BigBolshoi, 1000/h Mpc simulation box, 7/h kpc resolution). The extraction methods for halos/subhalos from the raw simulation data, and how this data is structured in the database, are explained in this paper. With the first data release, users get full access to halo/subhalo catalogs, various profiles of the halos at redshifts z=0-15, and raw dark matter data for one time-step of the Bolshoi and four time-steps of the MultiDark simulation. Later releases will also include galaxy mock catalogs and additional merging trees for both simulations as well as new large volume simulations with high resolution. This project further demonstrates the viability of storing and presenting complex data with relational database technology. We encourage other simulators to publish their results in a similar manner.
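To give a flavour of the kind of question such an SQL interface supports, the Python snippet below assembles an illustrative halo mass function query. The table, column, and snapshot names are placeholders rather than the actual MultiDark schema, which is documented at www.multidark.org.

# Compose an example SQL query counting halos in logarithmic mass bins.
# All identifiers below (table, columns, snapshot number) are illustrative only.
halo_mass_function_query = """
    SELECT FLOOR(LOG10(mass) / 0.1) * 0.1 AS log10_mass_bin,
           COUNT(*)                       AS n_halos
    FROM   mdr1_fof_halos          -- placeholder table name
    WHERE  snapnum = 85            -- placeholder snapshot identifier
    GROUP  BY FLOOR(LOG10(mass) / 0.1)
    ORDER  BY log10_mass_bin
"""
print(halo_mass_function_query)    # paste into the web SQL form or submit via a script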
We present the first full release of a survey of the 150 MHz radio sky, observed with the Giant Metrewave Radio Telescope between April 2010 and March 2012 as part of the TGSS project. Aimed at producing a reliable compact source survey, our automated data reduction pipeline efficiently processed more than 2000 hours of observations with minimal human interaction. Through application of innovative techniques such as image-based flagging, direction-dependent calibration of ionospheric phase errors, correcting for systematic offsets in antenna pointing, and improving the primary beam model, we created good quality images for over 95 percent of the 5336 pointings. Our data release covers 36,900 square degrees (or 3.6 pi steradians) of the sky between -53 deg and +90 deg DEC, which is 90 percent of the total sky. The majority of pointing images have a background RMS noise below 5 mJy/beam with an approximate resolution of 25 arcsec x 25 arcsec (or 25 arcsec x 25 arcsec / cos (DEC - 19 deg) for pointings south of 19 deg DEC). We have produced a catalog of 0.62 million radio sources derived from an initial, high reliability source extraction at the 7 sigma level. For the bulk of the survey, the measured overall astrometric accuracy is better than 2 arcsec in RA and DEC, while the flux density accuracy is estimated at ~10 percent. Within the scope of the TGSS ADR project, the source catalog as well as 5336 mosaic images (5 deg x 5 deg) and an image cutout service are made publicly available online as a service to the astronomical community. In addition to enabling a wide range of different scientific investigations, we anticipate that these survey products provide a solid reference for various new low-frequency radio aperture array telescopes (LOFAR, LWA, MWA, SKA-low), and can play an important role in characterizing the EoR foreground. The TGSS ADR project aims at continuously improving the quality of the survey data products.
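The declination dependence of the restoring beam quoted above is simple enough to capture in a few lines; the Python helper below is a direct transcription of that formula (declination in degrees, beam axes in arcseconds).

import math

def tgss_beam_arcsec(dec_deg):
    """Approximate restoring beam (in arcsec) at a given declination (deg)."""
    if dec_deg >= 19.0:
        return 25.0, 25.0                                   # circular beam north of DEC 19 deg
    return 25.0, 25.0 / math.cos(math.radians(dec_deg - 19.0))  # elongated beam further south

print(tgss_beam_arcsec(-40.0))   # roughly (25.0, 48.5) arcsec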
In preparation for cosmological analyses of the Vera C. Rubin Observatory Legacy Survey of Space and Time (LSST), the LSST Dark Energy Science Collaboration (LSST DESC) has created a 300 deg$^2$ simulated survey as part of an effort called Data Challenge 2 (DC2). The DC2 simulated sky survey, in six optical bands with observations following a reference LSST observing cadence, was processed with the LSST Science Pipelines (19.0.0). In this Note, we describe the public data release of the resulting object catalogs for the coadded images of five years of simulated observations along with associated truth catalogs. We include a brief description of the major features of the available data sets. To enable convenient access to the data products, we have developed a web portal connected to Globus data services. We describe how to access the data and provide example Jupyter Notebooks in Python to aid first interactions with the data. We welcome feedback and questions about the data release via a GitHub repository.
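In the spirit of the release's example notebooks, a first look at a downloaded object-catalog file might resemble the Python sketch below; the file name and column names are placeholders and should be replaced with those documented in the release and its example Jupyter Notebooks.

import pandas as pd

# Placeholder file name; in practice this would be one of the released catalog
# files retrieved through the Globus-backed web portal.
obj = pd.read_parquet("dc2_object_catalog_tract.parquet")
print(len(obj), "objects; first columns:", list(obj.columns)[:10])

# Hypothetical quality cut and a simple g-r colour; column names are placeholders.
good = obj[obj["extendedness"] > 0.5]
gr = good["mag_g"] - good["mag_r"]
print(gr.describe())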
Constraining neutrino mass remains an elusive challenge in modern physics. Precision measurements are expected from several upcoming cosmological probes of large-scale structure. Achieving this goal relies on an equal level of precision from theoretical predictions of neutrino clustering. Numerical simulations of the non-linear evolution of cold dark matter and neutrinos play a pivotal role in this process. We incorporate neutrinos into the cosmological N-body code CUBEP3M and discuss the challenges associated with pushing to the extreme scales demanded by the neutrino problem. We highlight code optimizations made to exploit modern high performance computing architectures and present a novel method of data compression that reduces the phase-space particle footprint from 24 bytes in single precision to roughly 9 bytes. We scale the neutrino problem to the Tianhe-2 supercomputer and provide details of our production run, named TianNu, which uses 86% of the machine (13,824 compute nodes). With a total of 2.97 trillion particles, TianNu is currently the world's largest cosmological N-body simulation and improves upon previous neutrino simulations by two orders of magnitude in scale. We finish with a discussion of the unanticipated computational challenges that were encountered during the TianNu run.
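To illustrate how a roughly 9-byte phase-space footprint can be reached, the Python sketch below quantizes each position component as a 1-byte offset within a coarse grid cell and each velocity component as a 2-byte fixed-point value (3 + 6 = 9 bytes per particle instead of 24). This is one plausible realization of such a cell-relative scheme, with illustrative grid and velocity parameters; it is not claimed to be the exact layout used in CUBEP3M.

import numpy as np

NGRID = 64          # coarse cells per dimension (illustrative)
VMAX = 5000.0       # velocity clamp in km/s (illustrative)

def compress(pos, vel, box):
    """Quantize positions/velocities; pos and vel are (N, 3) float arrays."""
    scaled = pos / box * NGRID
    cell = np.floor(scaled)                      # cell index; in practice stored implicitly
                                                 # by ordering particles per cell
    qpos = np.round((scaled - cell) * 255).astype(np.uint8)          # 1 byte per component
    qvel = np.round((np.clip(vel, -VMAX, VMAX) / VMAX * 0.5 + 0.5)
                    * 65535).astype(np.uint16)                       # 2 bytes per component
    return cell.astype(np.int32), qpos, qvel

def decompress(cell, qpos, qvel, box):
    """Approximate inverse of compress()."""
    pos = (cell + qpos / 255.0) * box / NGRID
    vel = (qvel / 65535.0 - 0.5) * 2.0 * VMAX
    return pos, vel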