Based on the SDSS and SDSS-WISE quasar datasets, we put forward two schemes to estimate the photometric redshifts of quasars. In both schemes, a classifier first divides the sample into subsamples, and a regressor then estimates the photometric redshifts of each subsample. Random Forest is adopted as the core algorithm of the classifiers, while Random Forest and kNN are applied as the key algorithms of the regressors. The samples are divided into two or four subsamples depending on the redshift distribution. The performance of different samples, different algorithms and different schemes is compared. The experimental results indicate that the accuracy of photometric redshift estimation for the two schemes generally improves to some extent compared to the original scheme in terms of the percentages with $\frac{|\Delta z|}{1+z_{i}}<0.1$ and $\frac{|\Delta z|}{1+z_{i}}<0.2$ and the mean absolute error. In terms of running speed alone, kNN is superior to Random Forest. The performance of Random Forest is slightly better than or comparable to that of kNN on the two datasets. The accuracy based on the SDSS-WISE sample exceeds that based on the SDSS sample, whether by kNN or by Random Forest: information from more bands helps to improve the accuracy of photometric redshift estimation. Our strategy for estimating photometric redshifts is evidently applicable and may be applied to other datasets or other kinds of objects. In terms of the percentage with $\frac{|\Delta z|}{1+z_{i}}<0.3$ alone, there is still large room for further improvement in the photometric redshift estimation.
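The accuracy measures quoted above can be written down in a few lines. The sketch below is illustrative only: it assumes $z_{i}$ is the spectroscopic redshift and $\Delta z$ is the difference between the photometric and spectroscopic values. The function name `photoz_metrics` and the sample numbers are hypothetical and not taken from the paper.

```python
def photoz_metrics(z_spec, z_phot, thresholds=(0.1, 0.2, 0.3)):
    """Return the fraction of objects with |dz|/(1+z_spec) < e for each e, plus the MAE."""
    if len(z_spec) != len(z_phot) or len(z_spec) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Absolute redshift error per object.
    abs_dz = [abs(zp - zs) for zs, zp in zip(z_spec, z_phot)]
    # Error normalized by (1 + z_spec); z_spec is ASSUMED to be the z_i of the text.
    norm_dz = [d / (1.0 + zs) for d, zs in zip(abs_dz, z_spec)]
    n = len(z_spec)
    metrics = {f"frac_lt_{e}": sum(v < e for v in norm_dz) / n for e in thresholds}
    metrics["mae"] = sum(abs_dz) / n  # mean absolute error
    return metrics


# Toy values (assumed, for illustration only).
z_spec = [1.0, 2.0, 3.0, 0.5]
z_phot = [1.1, 3.2, 3.6, 0.5]
metrics = photoz_metrics(z_spec, z_phot)
print(metrics)
```

Reporting several thresholds side by side, as the abstract does, separates the core accuracy from the tail of catastrophic outliers, which a single mean absolute error would blur together.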
Machine learning (ML) is a standard approach for estimating the redshifts of galaxies when only photometric information is available. ML photo-z solutions have traditionally ignored the morphological information available in galaxy images or partly included it in the form of hand-crafted features, with mixed results. We train a morphology-aware photometric redshift machine using modern deep learning tools. It uses a custom architecture that jointly trains on galaxy fluxes, colors and images. Galaxy-integrated quantities are fed to a Multi-Layer Perceptron (MLP) branch while images are fed to a convolutional (convnet) branch that can learn relevant morphological features. This split MLP-convnet architecture, which aims to disentangle strong photometric features from comparatively weak morphological ones, proves important for strong performance: a regular convnet-only architecture, while exposed to all available photometric information in images, delivers comparatively poor performance. We present a cross-validated MLP-convnet model trained on 130,000 SDSS-DR12 galaxies that outperforms a hyperoptimized Gradient Boosting solution (hyperopt+XGBoost), as well as the equivalent MLP-only architecture, on the redshift bias metric. The 4-fold cross-validated MLP-convnet model achieves a bias $\delta z / (1+z) = -0.70 \pm 1 \times 10^{-3}$, approaching the performance of a reference ANNZ2 ensemble of 100 distinct models trained on a comparable dataset. The relative performance of the morphology-aware and morphology-blind models indicates that galaxy morphology does improve ML-based photometric redshift estimation.
The observing strategy of a galaxy survey influences the degree to which its resulting data can be used to accomplish any science goal. LSST is thus seeking metrics of observing strategies for multiple science cases in order to optimally choose a cadence. Photometric redshifts are essential for many extragalactic science applications of LSST's data, including but not limited to cosmology, but there are few metrics available, and they are not straightforwardly integrated with metrics of other cadence-dependent quantities that may influence any given use case. We propose a metric for observing strategy optimization based on the potentially recoverable mutual information about redshift from a photometric sample under the constraints of a realistic observing strategy. We demonstrate a tractable estimation of a variational lower bound of this mutual information implemented in a public code using conditional normalizing flows. By comparing the recoverable redshift information across observing strategies, we can distinguish between those that preclude robust redshift constraints and those whose data will preserve more redshift information, to be generically utilized in a downstream analysis. We recommend the use of this versatile metric for observing strategy optimization for redshift-dependent extragalactic use cases, including but not limited to cosmology, as well as any other science applications for which photometry may be modeled from true parameter values beyond redshift.
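For orientation, a commonly used variational lower bound on mutual information (the Barber-Agakov form) is sketched below. That the public code optimizes exactly this expression is an assumption. Here $x$ denotes the photometry, $q_\phi(z \mid x)$ a conditional density model such as a conditional normalizing flow, and $H(z)$ the entropy of the redshift distribution; this notation is introduced for illustration and does not come from the abstract.

```latex
\begin{align}
I(z; x) &= H(z) - H(z \mid x) \\
        &= H(z) + \mathbb{E}_{p(z, x)}\left[\log p(z \mid x)\right] \\
        &\ge H(z) + \mathbb{E}_{p(z, x)}\left[\log q_\phi(z \mid x)\right]
\end{align}
```

The inequality follows from the non-negativity of the Kullback-Leibler divergence between $p(z \mid x)$ and $q_\phi(z \mid x)$, so the bound tightens as the model approaches the true conditional density.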
The wavelength dependence of atmospheric refraction causes differential chromatic refraction (DCR), whereby objects imaged at different optical/UV wavelengths are observed at slightly different positions in the plane of the detector. Strong spectral features induce changes in the effective wavelengths of broad-band filters that are capable of producing significant positional offsets with respect to standard DCR corrections. We examine such offsets for broad-emission-line (type 1) quasars from the Sloan Digital Sky Survey (SDSS) spanning $0<z<5$ and an airmass range of 1.0 to 1.8. These offsets are in good agreement with those predicted by convolving a composite quasar spectrum with the SDSS bandpasses as a function of redshift and airmass. This astrometric information can be used to break degeneracies in photometric redshifts of quasars (or other emission-line sources) and, for extreme cases, may be suitable for determining astrometric redshifts. On the SDSS's southern equatorial stripe, where it is possible to average many multi-epoch measurements, more than 60% of quasars have emission-line-induced astrometric offsets larger than the SDSS's relative astrometric errors of 25-35 mas. Folding these astrometric offsets into photometric redshift estimates yields an improvement of 9% within $\Delta z \pm 0.1$. Future multi-epoch synoptic surveys such as LSST and Pan-STARRS could benefit from intentionally making ~10 observations at relatively high airmass (AM~1.4) in order to improve their photometric redshifts for quasars.
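The mechanism described above, an emission line moving through a bandpass and shifting its effective wavelength, can be illustrated with a toy calculation. This is a minimal sketch under stated assumptions, not the paper's procedure: the spectrum is a flat continuum plus a single Gaussian Lyman-alpha line rather than a composite quasar spectrum, the bandpass is a hypothetical top-hat rather than an SDSS filter, the effective wavelength is taken as the flux-weighted mean wavelength, and all function names and parameter values are invented for illustration.

```python
import math

LYA_REST = 1216.0  # rest-frame Lyman-alpha wavelength in Angstrom


def toy_quasar_flux(lam, z, amplitude=5.0, sigma=30.0):
    """Flat continuum plus one Gaussian emission line (toy stand-in, ASSUMED shape)."""
    center = LYA_REST * (1.0 + z)  # observed-frame line center
    return 1.0 + amplitude * math.exp(-0.5 * ((lam - center) / sigma) ** 2)


def top_hat(lam, lo=4200.0, hi=4800.0):
    """Hypothetical top-hat bandpass: full throughput between lo and hi."""
    return 1.0 if lo <= lam <= hi else 0.0


def effective_wavelength(flux_fn, band_fn, grid):
    """Flux-weighted mean wavelength through the bandpass on a uniform grid."""
    num = sum(lam * flux_fn(lam) * band_fn(lam) for lam in grid)
    den = sum(flux_fn(lam) * band_fn(lam) for lam in grid)
    return num / den


grid = [4000.0 + i for i in range(1001)]  # 4000 to 5000 Angstrom, 1 Angstrom steps
flat = effective_wavelength(lambda lam: 1.0, top_hat, grid)
blue = effective_wavelength(lambda lam: toy_quasar_flux(lam, 2.5), top_hat, grid)
red = effective_wavelength(lambda lam: toy_quasar_flux(lam, 2.9), top_hat, grid)
print(flat, blue, red)
```

In this toy setup the line sits on the blue side of the band at z = 2.5 and on the red side at z = 2.9, so the effective wavelength moves below and above the flat-spectrum value respectively. Because refraction depends on wavelength, such shifts translate into redshift-dependent positional offsets.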
A variety of fundamental astrophysical science topics require the determination of very accurate photometric redshifts (photo-z's). A wide variety of methods has been developed, based either on template model fitting or on empirical exploration of the photometric parameter space. Machine learning based techniques are not explicitly dependent on physical priors and are able to produce accurate photo-z estimates within the photometric ranges derived from the spectroscopic training set. These estimates, however, are not easy to characterize in terms of a photo-z Probability Density Function (PDF), because the analytical relation mapping the photometric parameters onto the redshift space is virtually unknown. We present METAPHOR (Machine-learning Estimation Tool for Accurate PHOtometric Redshifts), a method designed to provide a reliable PDF of the error distribution for empirical techniques. The method is implemented as a modular workflow, whose internal engine for photo-z estimation makes use of the MLPQNA neural network (Multi Layer Perceptron with Quasi Newton learning rule), with the possibility to easily replace the specific machine learning model chosen to predict photo-z's. We present a summary of results on SDSS-DR9 galaxy data, used also to perform a direct comparison with PDFs obtained by the Le Phare SED template fitting. We show that METAPHOR is capable of estimating the precision and reliability of photometric redshifts obtained with three different self-adaptive techniques, i.e. MLPQNA, Random Forest and the standard K-Nearest Neighbors models.
In this paper we apply ideas from information theory to create a method for the design of optimal filters for photometric redshift estimation. We show the method applied to a series of simple example filters in order to motivate an intuition for how photometric redshift estimators respond to the properties of photometric passbands. We then design a realistic set of six filters covering optical wavelengths that optimize photometric redshifts for $z \le 2.3$ and $i < 25.3$. We create a simulated catalog for these optimal filters and use our filters with a photometric redshift estimation code to show that we can improve the standard deviation of the photometric redshift error by 7.1% overall and improve outliers by 9.9% over the standard filters proposed for the Large Synoptic Survey Telescope (LSST). We compare features of our optimal filters to LSST and find that the LSST filters incorporate key features for optimal photometric redshift estimation. Finally, we describe how information theory can be applied to a range of optimization problems in astronomy.