We present a catalog of quasars and corresponding redshifts in the Kilo-Degree Survey (KiDS) Data Release 4. We trained machine learning (ML) models, using optical ugri and near-infrared ZYJHK_s bands, on objects known from Sloan Digital Sky Survey (SDSS) spectroscopy. We define inference subsets from the 45 million objects of the KiDS photometric data limited to 9-band detections. We show that projections of the high-dimensional feature space can be successfully used to investigate the estimations. The model creation employs two test subsets, one of randomly selected objects and one of the faintest objects, which allows us to tune the bias versus variance trade-off. We tested three ML models: random forest (RF), XGBoost (XGB), and artificial neural network (ANN). We find that XGB is the most robust model for classification, while ANN performs best for combined classification and redshift. The inference results are tested using number counts, Gaia parallaxes, and other quasar catalogs. Based on these tests, we derived the minimum classification probability which provides the best purity versus completeness trade-off: p(QSO_cand) > 0.9 for r < 22 and p(QSO_cand) > 0.98 for 22 < r < 23.5. We find 158,000 quasar candidates in the safe inference subset (r < 22) and an additional 185,000 candidates in the reliable extrapolation regime (22 < r < 23.5). Test-data purity equals 97% and completeness is 94%; the latter drops by 3% in the extrapolation to data fainter by one magnitude than the training set. The photometric redshifts were modeled with Gaussian uncertainties. The redshift error (mean and scatter) equals 0.01 ± 0.1 in the safe subset and -0.0004 ± 0.2 in the extrapolation, in a redshift range of 0.14 < z < 3.63. The success of the extrapolation challenges the way that models are optimized and applied at the faint data end. The catalog is ready for cosmology and active galactic nucleus (AGN) studies.
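The magnitude-dependent probability cut quoted above can be sketched as a small filter. This is only an illustration of the stated thresholds, not the catalog's actual selection code: the function name and the `r_mag` and `p_qso` arguments are hypothetical, and the handling of objects at exactly r = 22 (placed here in the stricter bin) is an assumption, since the abstract does not specify it.

```python
def passes_qso_cut(r_mag, p_qso):
    """Apply the magnitude-dependent probability cut from the abstract.

    r_mag : r-band magnitude (hypothetical argument name)
    p_qso : classifier probability of the quasar class (hypothetical name)
    """
    if r_mag < 22.0:
        # Safe inference subset: p(QSO_cand) > 0.9
        return p_qso > 0.9
    if r_mag < 23.5:
        # Extrapolation regime: p(QSO_cand) > 0.98
        # Assumption: r == 22 exactly is treated with the stricter threshold.
        return p_qso > 0.98
    # Fainter objects are outside both regimes described in the abstract.
    return False


# Toy objects (made-up values, for illustration only).
objects = [
    {"r_mag": 21.0, "p_qso": 0.95},
    {"r_mag": 22.5, "p_qso": 0.95},
    {"r_mag": 22.5, "p_qso": 0.99},
    {"r_mag": 24.0, "p_qso": 0.999},
]
selected = [o for o in objects if passes_qso_cut(o["r_mag"], o["p_qso"])]
```

In this toy list only the first and third objects survive, because the same probability of 0.95 is sufficient at r = 21 but not at r = 22.5.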
We present a bright galaxy sample with accurate and precise photometric redshifts (photo-zs), selected using $ugriZYJHK_\mathrm{s}$ photometry from the Kilo-Degree Survey (KiDS) Data Release 4 (DR4). The highly pure and complete dataset is flux-limited at $r<20$ mag, covers $\sim1000$ deg$^2$, and contains about 1 million galaxies after artifact masking. We exploit the overlap with Galaxy And Mass Assembly (GAMA) spectroscopy as calibration to determine photo-zs with the supervised machine learning neural network algorithm implemented in the ANNz2 software. The photo-zs have mean error of $|\langle \delta z \rangle| \sim 5 \times 10^{-4}$ and low scatter (scaled mean absolute deviation of $\sim 0.018(1+z)$), both practically independent of the $r$-band magnitude and photo-z at $0.05 < z_\mathrm{phot} < 0.5$. Combined with the 9-band photometry, these allow us to estimate robust absolute magnitudes and stellar masses for the full sample. As a demonstration of the usefulness of these data we split the dataset into red and blue galaxies, use them as lenses and measure the weak gravitational lensing signal around them for five stellar mass bins. We fit a halo model to these high-precision measurements to constrain the stellar-mass--halo-mass relations for blue and red galaxies. We find that for high stellar mass ($M_\star>5\times 10^{11} M_\odot$), the red galaxies occupy dark matter halos that are much more massive than those occupied by blue galaxies with the same stellar mass. The data presented here are publicly released via the KiDS webpage at http://kids.strw.leidenuniv.nl/DR4/brightsample.php.
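As a rough illustration of the photo-z quality statistics mentioned above, the sketch below computes a mean error and a scaled absolute-deviation scatter from matched photometric and spectroscopic redshifts. The abstract does not spell out the estimators, so two conventions are assumed here: the error is normalized as (z_phot - z_spec)/(1 + z_spec), and the scatter is taken as 1.4826 times the median absolute deviation from the median, a common robust choice that may differ from what the paper uses. The function and variable names are hypothetical.

```python
from statistics import mean, median


def photoz_stats(z_phot, z_spec):
    """Return (mean normalized error, scaled MAD) for matched redshift lists.

    Assumptions (not taken from the abstract):
      * errors are normalized by (1 + z_spec);
      * scatter = 1.4826 * median(|dz - median(dz)|).
    """
    dz = [(zp - zs) / (1.0 + zs) for zp, zs in zip(z_phot, z_spec)]
    bias = mean(dz)
    centre = median(dz)
    # 1.4826 rescales the MAD to match the standard deviation of a Gaussian.
    smad = 1.4826 * median([abs(d - centre) for d in dz])
    return bias, smad


# Toy values (made up) showing the call pattern.
z_spec_demo = [0.10, 0.20, 0.30, 0.40]
z_phot_demo = [0.11, 0.19, 0.32, 0.39]
bias_demo, smad_demo = photoz_stats(z_phot_demo, z_spec_demo)
```

The median-based scatter is less sensitive to catastrophic outliers than a plain standard deviation, which is why conventions of this kind are popular for photo-z work.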
We present a catalog of quasars selected from broad-band photometric ugri data of the Kilo-Degree Survey Data Release 3 (KiDS DR3). The quasars (QSOs) are identified by a random forest (RF) supervised machine learning model, trained on SDSS DR14 spectroscopic data. We first clean the input KiDS data of entries with excessively noisy, missing or otherwise problematic measurements. Applying a feature importance analysis, we then tune the algorithm and identify in the KiDS multiband catalog the 17 most useful features for the classification, namely magnitudes, colors, magnitude ratios, and the stellarity index. We use the t-SNE algorithm to map the multi-dimensional photometric data onto 2D planes and compare the coverage of the training and inference sets. We limit the inference set to r<22 to avoid extrapolation beyond the feature space covered by training, as the SDSS spectroscopic sample is considerably shallower than KiDS. This gives 3.4 million objects in the final inference sample, from which the random forest identifies 190,000 quasar candidates. Accuracy of 97%, purity of 91%, and completeness of 87%, as derived from a test set extracted from SDSS and not used in the training, are confirmed by comparison with external spectroscopic and photometric QSO catalogs overlapping with the KiDS footprint. The robustness of our results is strengthened by number counts of the quasar candidates in the r band, as well as by their mid-infrared colors available from WISE. An analysis of parallaxes and proper motions of our QSO candidates also found in Gaia DR2 suggests that a probability cut of p(QSO)>0.8 is optimal for purity, whereas p(QSO)>0.7 is preferable for better completeness. Our study presents the first comprehensive quasar selection from deep high-quality KiDS data and will serve as the basis for versatile studies of the QSO population detected by this survey.
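For readers less familiar with the terminology, the accuracy, purity, and completeness figures quoted above can be illustrated with the usual confusion-matrix definitions, treating the quasar class as "positive": purity = TP/(TP+FP) and completeness = TP/(TP+FN). The sketch below is a minimal example under that assumption. The function name and class labels are hypothetical, and the accuracy here is the binary QSO versus non-QSO accuracy, which may differ from a multi-class accuracy as used in the paper.

```python
def classification_metrics(y_true, y_pred, positive="QSO"):
    """Return (accuracy, purity, completeness) for one class of interest.

    Assumed definitions (standard, but not quoted from the abstract):
      accuracy     = (TP + TN) / N      (binary: positive vs. everything else)
      purity       = TP / (TP + FP)     (also called precision)
      completeness = TP / (TP + FN)     (also called recall)
    """
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else float("nan")
    purity = tp / (tp + fp) if (tp + fp) else float("nan")
    completeness = tp / (tp + fn) if (tp + fn) else float("nan")
    return accuracy, purity, completeness


# Toy labels (made up): 3 true quasars, 2 recovered, 1 contaminant.
truth_demo = ["QSO", "QSO", "STAR", "GALAXY", "QSO"]
pred_demo = ["QSO", "STAR", "QSO", "GALAXY", "QSO"]
acc_demo, pur_demo, comp_demo = classification_metrics(truth_demo, pred_demo)
```

Raising the probability threshold for accepting a candidate generally trades completeness for purity, which is the trade-off behind the two suggested cuts.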
We present a machine-learning photometric redshift analysis of the Kilo-Degree Survey Data Release 3, using two neural-network based techniques: ANNz2 and MLPQNA. Despite limited coverage of spectroscopic training sets, these ML codes provide photo-zs of quality comparable to, if not better than, those from the BPZ code, at least up to $z_\mathrm{phot}<0.9$ and $r<23.5$. At the bright end of $r<20$, where very complete spectroscopic data overlapping with KiDS are available, the performance of the ML photo-zs clearly surpasses that of BPZ, currently the primary photo-z method for KiDS. Using the Galaxy And Mass Assembly (GAMA) spectroscopic survey as calibration, we furthermore study how photo-zs improve for bright sources when photometric parameters additional to magnitudes are included in the photo-z derivation, as well as when VIKING and WISE infrared bands are added. While the fiducial four-band ugri setup gives a photo-z bias $\delta z=-2\times10^{-4}$ and scatter $\sigma_z<0.022$ at mean $z = 0.23$, combining magnitudes, colours, and galaxy sizes reduces the scatter by ~7% and the bias by an order of magnitude. Once the ugri and IR magnitudes are joined into 12-band photometry spanning up to 12 $\mu$m, the scatter decreases by more than 10% over the fiducial case. Finally, using the 12 bands together with optical colours and linear sizes gives $\delta z<4\times10^{-5}$ and $\sigma_z<0.019$. This paper also serves as a reference for two public photo-z catalogues accompanying KiDS DR3, both obtained using the ANNz2 code. The first one, of general purpose, includes all the 39 million KiDS sources with four-band ugri measurements in DR3. The second dataset, optimized for low-redshift studies such as galaxy-galaxy lensing, is limited to $r<20$, and provides photo-zs of much better quality than in the full-depth case thanks to incorporating optical magnitudes, colours, and sizes in the GAMA-calibrated photo-z derivation.
We present a sample of luminous red-sequence galaxies to study the large-scale structure in the fourth data release of the Kilo-Degree Survey. The selected galaxies are defined by a red-sequence template, in the form of a data-driven model of the colour-magnitude relation conditioned on redshift. In this work, the red-sequence template is built using the broad-band optical and near-infrared photometry of KiDS-VIKING and the overlapping spectroscopic data sets. The selection process involves estimating the red-sequence redshifts, assessing the purity of the sample, and estimating the underlying redshift distributions of redshift bins. After performing the selection, we mitigate the impact of survey properties on the observed number density of galaxies by assigning photometric weights to the galaxies. We measure the angular two-point correlation function of the red galaxies in four redshift bins, and constrain the large-scale bias of our red-sequence sample assuming a fixed $\Lambda$CDM cosmology. We find consistent linear biases for two luminosity-threshold samples (dense and luminous). We find that our constraints are well characterized by the passive evolution model.
The Kilo-Degree Survey (KiDS) is an ongoing optical wide-field imaging survey with the OmegaCAM camera at the VLT Survey Telescope. It aims to image 1500 square degrees in four filters (ugri). The core science driver is mapping the large-scale matter distribution in the Universe, using weak lensing shear and photometric redshift measurements. Further science cases include galaxy evolution, Milky Way structure, detection of high-redshift clusters, and finding rare sources such as strong lenses and quasars. Here we present the third public data release (DR3) and several associated data products, adding further area, homogenized photometric calibration, photometric redshifts and weak lensing shear measurements to the first two releases. A dedicated pipeline embedded in the Astro-WISE information system is used for the production of the main release. Modifications with respect to earlier releases are described in detail. Photometric redshifts have been derived using both Bayesian template fitting and machine-learning techniques. For the weak lensing measurements, optimized procedures based on the THELI data reduction and lensfit shear measurement packages are used. In DR3, stacked ugri images, weight maps, masks, and source lists for 292 new survey tiles (~300 sq. deg) are made available. The multi-band catalogue, including homogenized photometry and photometric redshifts, covers the combined DR1, DR2 and DR3 footprint of 440 survey tiles (447 sq. deg). Limiting magnitudes are typically 24.3, 25.1, 24.9, 23.8 (5 sigma in a 2 arcsec aperture) in ugri, respectively, and the typical r-band PSF size is less than 0.7 arcsec. The photometric homogenization scheme ensures accurate colors and an absolute calibration stable to ~2% for gri and ~3% in u. Separately released are a weak lensing shear catalogue and photometric redshifts based on two different machine-learning techniques.