We present the completion of a data analysis pipeline that self-consistently separates global 21-cm signals from large systematics using a pattern recognition technique. In the first paper of this series, we obtained optimal basis vectors from signal and foreground training sets to linearly fit both components with the minimal number of terms that best extracts the signal given its overlap with the foreground. In this second paper, we utilize the spectral constraints derived in the first paper to calculate the full posterior probability distribution of any chosen signal parameter space. The spectral fit provides the starting point for a Markov Chain Monte Carlo (MCMC) engine that samples the signal without traversing the foreground parameter space. At each MCMC step, we marginalize over the weights of all linear foreground modes and suppress those with unimportant variations by applying priors gleaned from the training set. This method drastically reduces the number of MCMC parameters, increasing the efficiency of exploration; circumvents the need to select a minimal number of foreground modes; and allows the complexity of the foreground model to be greatly increased, so that many observed spectra can be described simultaneously without requiring extra MCMC parameters. Using two nonlinear signal models, one based on EDGES observations and the other on phenomenological frequencies and temperatures of theoretically expected extrema, we demonstrate the success of this methodology by recovering the input parameters from multiple randomly simulated signals at low radio frequencies (10-200 MHz), while rigorously accounting for realistically modeled beam-weighted foregrounds.
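As a concrete illustration, the per-step marginalization over the linear foreground mode weights can be done analytically when the noise and the weight priors are Gaussian. The following is a minimal sketch, assuming a per-channel noise variance, a foreground basis `F` built from training-set modes, and training-set-derived prior variances on the weights; all names are illustrative, not the pipeline's actual API:

```python
import numpy as np

def marginalized_log_likelihood(data, signal, F, noise_var, prior_var):
    """Log-likelihood of a nonlinear signal model with the weights of the
    linear foreground modes (columns of F) marginalized analytically.

    data      : observed spectrum, shape (n,)
    signal    : nonlinear signal model at the current MCMC step, shape (n,)
    F         : foreground basis vectors, shape (n, m)
    noise_var : radiometer noise variance per channel, shape (n,)
    prior_var : prior variances on the m mode weights, shape (m,)
    """
    r = data - signal                      # residual left for foregrounds + noise
    Ninv_F = F / noise_var[:, None]        # N^{-1} F for diagonal noise covariance
    # Posterior precision of the weights: Lambda = F^T N^{-1} F + S^{-1}
    Lambda = F.T @ Ninv_F + np.diag(1.0 / prior_var)
    b = Ninv_F.T @ r                       # F^T N^{-1} r
    mean_w = np.linalg.solve(Lambda, b)    # posterior mean of the weights
    # Result of the Gaussian integral over the weights
    # (data-independent constants dropped)
    chi2 = r @ (r / noise_var) - b @ mean_w
    _, logdet_Lambda = np.linalg.slogdet(Lambda)
    return -0.5 * (chi2 + logdet_Lambda + np.sum(np.log(prior_var)))
```

In this picture, modes with unimportant variations are suppressed simply by assigning them small prior variances, so they never need to be removed from the basis by hand.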
When using valid foreground and signal models, the uncertainties on extracted signals in global 21-cm signal experiments depend principally on the overlap between the signal and foreground models. In this paper, we investigate two strategies for decreasing this overlap: (i) utilizing time dependence by fitting multiple drift-scan spectra simultaneously and (ii) measuring all four Stokes parameters instead of only the total power, Stokes I. Although measuring polarization requires instrumentation different from that used in most existing experiments, all existing experiments can utilize drift-scan measurements merely by averaging their data differently. In order to evaluate the increase in constraining power from these two techniques, we define a method for connecting Root-Mean-Square (RMS) uncertainties to probabilistic confidence levels. Employing simulations, we find that fitting only one total-power spectrum leads to RMS uncertainties at the few-K level, while fitting multiple time-binned, drift-scan spectra yields uncertainties at the $\lesssim 10$ mK level. This significant improvement appears only if the spectra are modeled with one set of basis vectors, instead of multiple sets that independently model each spectrum. Assuming that they are simulated accurately, measuring all four Stokes parameters also leads to lower uncertainties. The two strategies can be employed simultaneously, and fitting multiple time bins of all four Stokes parameters yields the most precise measurements of the 21-cm signal, approaching the noise level in the data.
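One simple way to realize such a connection between RMS uncertainties and confidence levels (a sketch of the idea only, not necessarily the paper's exact prescription) is to evaluate the RMS deviation of each posterior signal realization from the posterior mean signal and read confidence levels off the distribution of that statistic:

```python
import numpy as np

def rms_confidence_level(signal_samples, rms_threshold):
    """Fraction of posterior signal realizations whose RMS deviation from
    the posterior mean signal falls below rms_threshold, i.e. the
    confidence level associated with that RMS uncertainty.

    signal_samples : MCMC signal realizations, shape (n_samples, n_channels)
    rms_threshold  : RMS level in the same units as the samples (e.g. mK)
    """
    mean_signal = signal_samples.mean(axis=0)
    rms = np.sqrt(((signal_samples - mean_signal) ** 2).mean(axis=1))
    return (rms <= rms_threshold).mean()

def confidence_to_rms(signal_samples, confidence=0.68):
    """Inverse map: the RMS uncertainty enclosing a given confidence level."""
    mean_signal = signal_samples.mean(axis=0)
    rms = np.sqrt(((signal_samples - mean_signal) ** 2).mean(axis=1))
    return np.quantile(rms, confidence)
```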
We report constraints on the global $21$ cm signal due to neutral hydrogen at redshifts $14.8 \geq z \geq 6.5$. We derive our constraints from low-foreground observations of the average sky brightness spectrum conducted with the EDGES High-Band instrument between September 7 and October 26, 2015. Observations were calibrated by accounting for the effects of antenna beam chromaticity, antenna and ground losses, signal reflections, and receiver parameters. We evaluate the consistency between the spectrum and phenomenological models for the global $21$ cm signal. For tanh-based representations of the ionization history during the epoch of reionization, we rule out, at $\geq 2\sigma$ significance, models with duration of up to $\Delta z = 1$ at $z \approx 8.5$ and higher than $\Delta z = 0.4$ across most of the observed redshift range under the usual assumption that the $21$ cm spin temperature is much larger than the temperature of the cosmic microwave background (CMB) during reionization. We also investigate a `cold' IGM scenario that assumes perfect Ly$\alpha$ coupling of the $21$ cm spin temperature to the temperature of the intergalactic medium (IGM), but that the IGM is not heated by early stars or stellar remnants. Under this assumption, we reject tanh-based reionization models of duration $\Delta z \lesssim 2$ over most of the observed redshift range. Finally, we explore and reject a broad range of Gaussian models for the $21$ cm absorption feature expected in the First Light era. As an example, we reject $100$ mK Gaussians with duration (full width at half maximum) $\Delta z \leq 4$ over the range $14.2 \geq z \geq 6.5$ at $\geq 2\sigma$ significance.
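For concreteness, the two phenomenological model families can be written down in a few lines (a sketch with illustrative parameter names; the paper's exact definition of the duration $\Delta z$ for the tanh models may differ):

```python
import numpy as np

def xHI_tanh(z, z_r, dz):
    """Neutral fraction for a tanh reionization history centered at z_r
    with duration parameter dz (fully neutral at high z, ionized at low z)."""
    return 0.5 * (1.0 + np.tanh((z - z_r) / dz))

def T21_gaussian(z, amplitude_mK, z0, dz_fwhm):
    """Gaussian model for the 21-cm absorption feature: depth amplitude_mK,
    centered at z0, with full width at half maximum dz_fwhm in redshift."""
    sigma = dz_fwhm / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    return -amplitude_mK * np.exp(-0.5 * ((z - z0) / sigma) ** 2)
```

For example, `T21_gaussian(z, 100.0, 10.0, 4.0)` is one of the $100$ mK, $\Delta z = 4$ Gaussian absorption features of the kind rejected above.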
The 21-cm signal of neutral hydrogen is a sensitive probe of the Epoch of Reionization (EoR) and Cosmic Dawn. Currently operating radio telescopes have ushered in a data-driven era of 21-cm cosmology, providing the first constraints on the astrophysical properties of the sources that drive this signal. However, extracting astrophysical information from the data is highly non-trivial and requires the rapid generation of theoretical templates over a wide range of astrophysical parameters. To this end, emulators are often employed, with previous efforts focused on predicting the power spectrum. In this work we introduce 21cmGEM, the first emulator of the global 21-cm signal from Cosmic Dawn and the EoR. The smoothness of the output signal is guaranteed by design. We train neural networks to predict the cosmological signal using a database of ~30,000 simulated signals created by varying seven astrophysical parameters: the star formation efficiency; the minimal mass of star-forming halos; the efficiency of the first X-ray sources; their spectrum, parameterized by a spectral index and a low-energy cutoff; the mean free path of ionizing photons; and the CMB optical depth. We test the performance with a set of ~2,000 simulated signals, showing that the relative error in the prediction has an r.m.s. of 0.0159. The algorithm is efficient, with a running time of 0.16 s per parameter set. Finally, we use the database of models to check the robustness of previously reported relations between the features of the global signal and the astrophysical parameters.
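A toy version of such an emulator (a sketch only; 21cmGEM's actual architecture, preprocessing, and training database are described in the paper, and the arrays below are random stand-ins) can be built with a small feed-forward network:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler

# Placeholder training set: 7 astrophysical parameters -> global signal
# sampled on n_nu frequency channels (random stand-ins for the ~30,000
# simulated signals in the actual database).
rng = np.random.default_rng(0)
n_train, n_params, n_nu = 1000, 7, 100
params = rng.uniform(size=(n_train, n_params))
signals = rng.normal(size=(n_train, n_nu))   # stand-in for simulated T21(nu)

# Standardize the inputs, then fit a small multi-output regressor
scaler = StandardScaler().fit(params)
emulator = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=500)
emulator.fit(scaler.transform(params), signals)

# Predict the global signal for a new parameter set
# (fast: no simulation run required)
theta = rng.uniform(size=(1, n_params))
T21_pred = emulator.predict(scaler.transform(theta))[0]
```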
A number of experiments are set to measure the 21-cm signal of neutral hydrogen from the Epoch of Reionization (EoR). The common denominator of these experiments is the large data sets they produce, contaminated by various instrumental effects, ionospheric distortions, RFI, and strong Galactic and extragalactic foregrounds. In this paper, the first in a series, we present the Data Model that will be the basis of the signal analysis for the LOFAR (Low Frequency Array) EoR Key Science Project (LOFAR EoR KSP). Using this data model we simulate realistic visibility data sets over a wide frequency band, properly taking into account all currently known instrumental corruptions (e.g. direction-dependent gains, complex gains, polarization effects, noise, etc.). We then apply primary calibration errors to the data in a statistical sense, assuming that the calibration errors are random Gaussian variates at a level consistent with our current knowledge based on observations with LOFAR Core Station 1. Our aim is to demonstrate how the systematics of an interferometric measurement affect the quality of the calibrated data, how errors correlate and propagate, and, in the long run, how this can lead to new calibration strategies. We present results of these simulations, the inversion process, and the extraction procedure. We also discuss some general properties of the coherency matrix and the Jones formalism that might prove useful in solving the calibration problem of aperture synthesis arrays. We conclude that even in the presence of realistic noise and instrumental errors, the statistical signature of the EoR signal can be detected by LOFAR with relatively small errors. A detailed study of the statistical properties of our data model and of more complex instrumental models will be considered in future work.
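At the heart of such a data model is the measurement equation for a baseline $(p, q)$, $V_{pq} = J_p\,C\,J_q^{H}$, where $C$ is the source coherency matrix and the Jones matrices $J_p$, $J_q$ collect the gain and polarization effects for each station. A minimal sketch of applying this equation with random Gaussian calibration errors (illustrative only, not the LOFAR EoR KSP pipeline):

```python
import numpy as np

def corrupted_visibility(C, J_p, J_q, gain_error_sigma=0.01, rng=None):
    """Apply the 2x2 Jones measurement equation V_pq = J_p C J_q^H to a
    source coherency matrix C, with residual calibration errors modeled
    as random complex Gaussian perturbations of the nominal Jones terms.

    C, J_p, J_q      : complex 2x2 arrays
    gain_error_sigma : fractional level of the residual calibration errors
    """
    rng = rng or np.random.default_rng()
    E_p = np.eye(2) + gain_error_sigma * (rng.normal(size=(2, 2))
                                          + 1j * rng.normal(size=(2, 2)))
    E_q = np.eye(2) + gain_error_sigma * (rng.normal(size=(2, 2))
                                          + 1j * rng.normal(size=(2, 2)))
    Jp, Jq = E_p @ J_p, E_q @ J_q
    return Jp @ C @ Jq.conj().T

# Unpolarized 1 Jy source: coherency matrix C = 0.5 * I
C = 0.5 * np.eye(2, dtype=complex)
V = corrupted_visibility(C, np.eye(2, dtype=complex), np.eye(2, dtype=complex))
```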
One approach to extracting the global 21-cm signal from total-power measurements at low radio frequencies is to parametrize the different contributions to the data and then fit for these parameters. We examine parametrizations of the 21-cm signal itself, and propose one based on modelling the Lyman-alpha background, IGM temperature and hydrogen ionized fraction using tanh functions. This captures the shape of the signal from a physical modelling code better than an earlier parametrization based on interpolating between maxima and minima of the signal, and imposes a greater level of physical plausibility. This allows less biased constraints on the turning points of the signal, even though these are not explicitly fit for. Biases can also be alleviated by discarding information which is less robustly described by the parametrization, for example by ignoring detailed shape information coming from the covariances between turning points or from the high-frequency parts of the signal, or by marginalizing over the high-frequency parts of the signal by fitting a more complex foreground model. The fits are sufficiently accurate to be usable for experiments gathering 1000 h of data, though in this case it may be important to choose observing windows which do not include the brightest areas of the foregrounds. Our assumption of pointed, single-antenna observations and very broad-band fitting makes these results particularly applicable to experiments such as the Dark Ages Radio Explorer, which would study the global 21-cm signal from the clean environment of a low lunar orbit, taking data from the far side.
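A simplified version of this tanh parametrization (a sketch with illustrative parameter names; the actual coupling coefficients and history shapes are more detailed, and collisional coupling is neglected here) might look like:

```python
import numpy as np

def tanh_step(z, z_ref, dz):
    """Generic tanh transition rising from 0 to 1 toward low redshift."""
    return 0.5 * (1.0 + np.tanh((z_ref - z) / dz))

def global_signal_mK(z, x_alpha0, z_J, dz_J, T_heat, z_T, dz_T, z_r, dz_r):
    """Differential brightness temperature built from tanh models of the
    Lyman-alpha coupling, IGM temperature, and ionized fraction, using the
    standard spin-temperature and 27 mK fiducial-cosmology expressions."""
    T_gamma = 2.725 * (1.0 + z)                       # CMB temperature [K]
    T_adiab = 0.02 * (1.0 + z) ** 2                   # adiabatic IGM cooling [K]
    x_alpha = x_alpha0 * tanh_step(z, z_J, dz_J)      # Ly-alpha coupling strength
    T_K = T_adiab + T_heat * tanh_step(z, z_T, dz_T)  # heated IGM temperature [K]
    x_i = tanh_step(z, z_r, dz_r)                     # hydrogen ionized fraction
    # Spin temperature: weighted harmonic mean of T_gamma and T_K
    T_S = (1.0 + x_alpha) / (1.0 / T_gamma + x_alpha / T_K)
    return 27.0 * (1.0 - x_i) * (1.0 - T_gamma / T_S) * np.sqrt((1.0 + z) / 10.0)
```

At high redshift this reduces to zero signal (no coupling), passes through absorption once the Lyman-alpha background turns on, moves into emission after X-ray heating, and vanishes again as reionization completes, reproducing the turning-point structure discussed above.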