We demonstrate that highly accurate joint redshift-stellar mass probability distribution functions (PDFs) can be obtained using the Random Forest (RF) machine learning (ML) algorithm, even with few photometric bands available. As an example, we use the Dark Energy Survey (DES), combined with the COSMOS2015 catalogue for redshifts and stellar masses. We build two ML models: one containing deep photometry in the $griz$ bands, and the second reflecting the photometric scatter present in the main DES survey, with carefully constructed representative training data in each case. We validate our joint PDFs for $10,699$ test galaxies by utilizing the copula probability integral transform and the Kendall distribution function, and their univariate counterparts to validate the marginals. Benchmarked against a basic set-up of the template-fitting code BAGPIPES, our ML-based method outperforms template fitting on all of our predefined performance metrics. In addition to accuracy, the RF is extremely fast, able to compute joint PDFs for a million galaxies in just under $6$ min with consumer computer hardware. Such speed enables PDFs to be derived in real time within analysis codes, solving potential storage issues. As part of this work we have developed GALPRO, a highly intuitive and efficient Python package to rapidly generate multivariate PDFs on-the-fly. GALPRO is documented and available for researchers to use in their cosmology and galaxy evolution studies.
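The core idea of using a random forest for joint PDFs can be illustrated simply: the per-tree outputs of a multi-output forest act as samples of the joint redshift-stellar mass distribution for each galaxy. Below is a minimal sketch with scikit-learn on mock data; the mock magnitudes, the toy target relations, and the `joint_pdf_samples` helper are illustrative assumptions, not the GALPRO implementation itself.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Toy stand-in data: 4 photometric bands -> (redshift, log stellar mass).
X = rng.uniform(18, 25, size=(2000, 4))  # mock griz magnitudes
y = np.column_stack([
    0.1 * (X[:, 0] - 18) + 0.05 * rng.standard_normal(2000),      # mock z
    8.0 + 0.3 * (25 - X[:, 3]) + 0.1 * rng.standard_normal(2000), # mock log M*
])

rf = RandomForestRegressor(n_estimators=200, min_samples_leaf=5, random_state=0)
rf.fit(X, y)

def joint_pdf_samples(rf, x):
    """Per-tree predictions as samples of the joint (z, log M*) PDF."""
    return np.array([tree.predict(x.reshape(1, -1))[0] for tree in rf.estimators_])

samples = joint_pdf_samples(rf, X[0])
print(samples.shape)  # (200, 2): 200 joint samples for one galaxy
```

A kernel density estimate over these samples then yields a smooth joint PDF, and either column alone gives the marginal redshift or stellar-mass PDF.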
We present results of using individual galaxies' redshift probability information, derived from a photometric redshift (photo-z) algorithm, SPIDERz, to identify potential catastrophic outliers in photometric redshift determinations. Using two test data sets of COSMOS multi-band photometry spanning a wide redshift range (0<z<4), matched with reliable spectroscopic or other redshift determinations, we explore the efficacy of a novel method to flag potential catastrophic outliers in analyses that rely on accurate photometric redshifts. SPIDERz is a custom support vector machine classification algorithm for photo-z analysis that naturally outputs a distribution of redshift probability information for each galaxy, in addition to a discrete most probable photo-z value. By applying an analytic technique with flagging criteria to identify probability distribution features characteristic of catastrophic outlier photo-z estimates, such as multiple redshift probability peaks separated by substantial redshift distances, we can flag potential catastrophic outliers in photo-z determinations. We find that, depending on parameter choices, our proposed method correctly flags a large fraction (>50%) of the catastrophic outlier galaxies while flagging only a small fraction (<5%) of the total non-outlier galaxies. The fraction of non-outlier galaxies flagged, however, varies significantly with redshift and magnitude. We examine the performance of this strategy in photo-z determinations across a range of flagging parameter values. These results could be useful for the utilization of photometric redshifts in future large-scale surveys, where catastrophic outliers are particularly detrimental to the science goals.
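A simple version of this flagging criterion (multiple significant P(z) peaks separated by a large redshift distance) can be sketched as follows. The thresholds `peak_frac` and `dz_min` are illustrative placeholders, not SPIDERz's actual flagging parameters.

```python
import numpy as np

def flag_catastrophic(z_grid, pz, peak_frac=0.2, dz_min=1.0):
    """Flag a galaxy whose P(z) has >= 2 widely separated significant peaks.

    A grid point is a peak if it exceeds both neighbours; "significant"
    means at least peak_frac of the global maximum.  (Illustrative
    criteria; real flagging thresholds would be tuned on test data.)
    """
    pk = [i for i in range(1, len(pz) - 1)
          if pz[i] > pz[i - 1] and pz[i] > pz[i + 1]
          and pz[i] >= peak_frac * pz.max()]
    zs = z_grid[pk]
    return len(zs) >= 2 and (zs.max() - zs.min()) >= dz_min

z = np.linspace(0, 4, 401)
bimodal = np.exp(-((z - 0.5) / 0.05) ** 2) + 0.6 * np.exp(-((z - 2.8) / 0.05) ** 2)
unimodal = np.exp(-((z - 1.2) / 0.05) ** 2)
print(flag_catastrophic(z, bimodal), flag_catastrophic(z, unimodal))  # True False
```

Sweeping `peak_frac` and `dz_min` trades the fraction of outliers caught against the fraction of non-outliers spuriously flagged, mirroring the parameter study described above.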
At the highest redshifts, z>6, several tens of luminous quasars have been detected. The search for fainter AGN in deep X-ray surveys has proven less successful, with few candidates to date. An extrapolation of the relationship between black hole (BH) and bulge mass would predict that z>6 galaxies host relatively massive BHs (>1e6 Msun), if one assumes that total stellar mass is a good proxy for bulge mass. At least a few of these BHs should be luminous enough to be detectable in the 4 Ms CDFS. The relation between BH and stellar mass defined by local moderate-luminosity AGN in low-mass galaxies, however, has a normalization approximately an order of magnitude lower than that of the BH-bulge mass relation. We explore how this scaling changes the interpretation of AGN in the high-z Universe. Despite large uncertainties, driven by those in the stellar mass function and in the extrapolation of local relations, one can explain the current non-detection of moderate-luminosity AGN in Lyman Break Galaxies if galaxies below 1e11 Msun are characterized by the low-normalization scaling, and even more so if their Eddington ratio is also typical of moderate-luminosity AGN rather than luminous quasars. AGN missed by X-ray searches due to obscuration or intrinsic X-ray weakness also remain a possibility.
Observations suggest that satellite quenching plays a major role in the build-up of passive, low-mass galaxies at late cosmic times. Studies of low-mass satellites, however, are limited by the ability to robustly characterize the local environment and star-formation activity of faint systems. In an effort to overcome the limitations of existing data sets, we utilize deep photometry in Stripe 82 of the Sloan Digital Sky Survey, in conjunction with a neural network classification scheme, to study the suppression of star formation in low-mass satellite galaxies in the local Universe. Using a statistically-driven approach, we are able to push beyond the limits of existing spectroscopic data sets, measuring the satellite quenched fraction down to satellite stellar masses of ${\sim}10^7~{\rm M}_{\odot}$ in group environments (${M}_{\rm{halo}} = 10^{13-14}~h^{-1}~{\rm M}_{\odot}$). At high satellite stellar masses ($\gtrsim 10^{10}~{\rm M}_{\odot}$), our analysis successfully reproduces existing measurements of the quenched fraction based on spectroscopic samples. Pushing to lower masses, we find that the fraction of passive satellites increases, potentially signaling a change in the dominant quenching mechanism at ${M}_{\star} \sim 10^{9}~{\rm M}_{\odot}$. Similar to the results of previous studies of the Local Group, this increase in the quenched fraction at low satellite masses may correspond to an increase in the efficacy of ram-pressure stripping as a quenching mechanism in groups.
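Once each satellite carries a passive/star-forming label (here, the output of the classification scheme) and a stellar-mass estimate, the quenched fraction is a binned binomial statistic. The sketch below uses entirely mock data with an invented mass trend, purely to show the bookkeeping.

```python
import numpy as np

rng = np.random.default_rng(1)

# Mock catalogue: stellar masses plus a passive/star-forming label whose
# quenched fraction rises toward low mass (invented trend, for illustration).
logM = rng.uniform(7, 11, 50_000)
true_fq = np.clip(0.9 - 0.2 * (logM - 7), 0.05, 0.95)
passive = rng.random(50_000) < true_fq

bins = np.arange(7, 11.5, 0.5)
idx = np.digitize(logM, bins) - 1
fqs = []
for b in range(len(bins) - 1):
    sel = idx == b
    n, k = sel.sum(), passive[sel].sum()
    fq = k / n
    err = np.sqrt(fq * (1 - fq) / n)  # binomial error on the quenched fraction
    fqs.append(fq)
    print(f"logM = {bins[b]:.1f}-{bins[b+1]:.1f}: f_q = {fq:.2f} +/- {err:.2f}")
```

In practice the binomial error is only part of the budget; classification errors and interloper contamination in the group sample would need to be propagated as well.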
We study the components of cool and warm/hot gas in the circumgalactic medium (CGM) of simulated galaxies and address the relative production of OVI by photoionization versus collisional ionization, as a function of halo mass, redshift, and distance from the galaxy halo center. This is done utilizing two different suites of zoom-in hydro-cosmological simulations, VELA (6 halos; $z>1$) and NIHAO (18 halos; to $z=0$), which provide a broad theoretical basis because they use different codes and physical recipes for star formation and feedback. In all halos studied in this work, we find that collisional ionization by thermal electrons dominates at high redshift, while photoionization of cool or warm gas by the metagalactic radiation takes over near $z\sim2$. In halos of $\sim 10^{12}M_{\odot}$ and above, collisions become important again at $z<0.5$, while photoionization remains significant down to $z=0$ for less massive halos. In halos with $M_{\textrm v}>3\times10^{11}~M_{\odot}$, at $z\sim 0$ most of the photoionized OVI is in a warm, not cool, gas phase ($T\lesssim 3\times 10^5$~K). We also find that collisions are dominant in the central regions of halos, while photoionization is more significant at the outskirts, around $R_{\textrm v}$, even in massive halos. This too may be explained by the presence of warm gas or, in lower mass halos, by cool gas inflows.
The application of Bayesian techniques to astronomical data is generally non-trivial because the fitting parameters can be strongly degenerate and the formal uncertainties are themselves uncertain. An example is provided by the contradictory claims over the presence or absence of a universal acceleration scale ($g_{\dagger}$) in galaxies based on Bayesian fits to rotation curves. To illustrate the situation, we present an analysis in which the Newtonian gravitational constant $G_{\rm N}$ is allowed to vary from galaxy to galaxy when fitting rotation curves from the SPARC database, in analogy to $g_{\dagger}$ in the recently debated Bayesian analyses. When imposing flat priors on $G_{\rm N}$, we obtain a wide distribution of $G_{\rm N}$ which, taken at face value, would rule out $G_{\rm N}$ as a universal constant with high statistical confidence. However, imposing an empirically motivated log-normal prior returns a virtually constant $G_{\rm N}$ with no sacrifice in fit quality. This implies that the inference of a variable $G_{\rm N}$ (or $g_{\dagger}$) is the result of the combined effect of parameter degeneracies and unavoidable uncertainties in the error model. When these effects are taken into account, the SPARC data are consistent with a constant $G_{\rm N}$ (and a constant $g_{\dagger}$).
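The role of the prior can be illustrated with a toy degeneracy: if $v^2 = G M / r$ and the mass is uncertain, a flat prior on $G$ lets $G$ absorb the mass error galaxy by galaxy, while a tight log-normal prior recovers a nearly constant $G$ at comparable fit quality. The grid-posterior sketch below uses invented numbers throughout and is not the SPARC analysis.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy degeneracy: v^2 = G * M / r, with the mass M uncertain per "galaxy".
# We take the MAP of log10(G) on a grid, under two different priors.
logG_grid = np.linspace(-1, 1, 401)

def map_logG(v_obs, r, M_assumed, sigma_v, log_prior):
    """MAP estimate of log10(G) from a gridded toy posterior."""
    v_model = np.sqrt(10 ** logG_grid[:, None] * M_assumed / r[None, :])
    chi2 = np.sum(((v_obs - v_model) / sigma_v) ** 2, axis=1)
    return logG_grid[np.argmax(-0.5 * chi2 + log_prior(logG_grid))]

flat = lambda lg: np.zeros_like(lg)             # flat prior on log10(G)
lognormal = lambda lg: -0.5 * (lg / 0.05) ** 2  # tight prior centred on log10(G)=0

r = np.linspace(1, 10, 20)
maps_flat, maps_ln = [], []
for _ in range(200):
    M_true = 10 ** rng.normal(0, 0.3)           # true mass, poorly known
    v_obs = np.sqrt(1.0 * M_true / r) + rng.normal(0, 0.2, r.size)
    # Fit with a fixed, wrong mass: the mass error gets absorbed by G.
    maps_flat.append(map_logG(v_obs, r, 1.0, 0.2, flat))
    maps_ln.append(map_logG(v_obs, r, 1.0, 0.2, lognormal))

print(np.std(maps_flat), np.std(maps_ln))  # flat prior: broad; log-normal: tight
```

The scatter in the flat-prior estimates simply mirrors the scatter in the unmodelled nuisance parameter, which is the sense in which a wide inferred distribution of $G_{\rm N}$ (or $g_{\dagger}$) need not indicate a genuinely variable constant.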