Data-driven science is steadily gaining importance within astrophysics, owing to the huge amount of multi-wavelength data collected every day. These data are complex and high-volume, and they require efficient exploration tools that are automated as far as possible. Furthermore, to accomplish the main and legacy science objectives of future or upcoming large and deep survey projects, such as JWST, LSST and Euclid, a crucial role is played by the accurate estimation of photometric redshifts, which would permit the detection and analysis of extended and peculiar sources by disentangling low-z from high-z sources and would contribute to resolving the modern cosmological discrepancies. The recent photometric redshift data challenges, organized within several survey projects such as LSST and Euclid, have pushed the exploitation of multi-wavelength and multi-dimensional data, either observed or simulated ad hoc, to improve and optimize the prediction and statistical characterization of photometric redshifts based on both SED template fitting and machine learning methodologies. They have also given new impetus to the investigation of hybrid and deep learning techniques, which aim to combine the strengths of different methodologies, thus optimizing the estimation accuracy and maximizing the photometric range coverage. This is particularly important in the high-z regime, where spectroscopic ground truth is scarce. In this context, we summarize what has been learned and proposed in more than a decade of research.
Astronomy has entered the big data era, and machine learning based methods have found widespread use in a large variety of astronomical applications. This is demonstrated by the recent sharp increase in the number of publications making use of this approach. The use of machine learning methods, however, is still far from trivial, and many problems remain to be solved. Using the evaluation of photometric redshifts as a case study, we outline the main problems and some ongoing efforts to solve them.
Photometric redshifts (photo-zs) provide an alternative way to estimate the distances of large samples of galaxies and are therefore crucial to a large variety of cosmological problems. Among the various methods proposed over the years, supervised machine learning (ML) methods capable of interpolating the knowledge gained from spectroscopic data have proven to be very effective. METAPHOR (Machine-learning Estimation Tool for Accurate PHOtometric Redshifts) is a novel method designed to provide a reliable PDF (Probability Density Function) of the error distribution of photometric redshifts predicted by ML methods. The method is implemented as a modular workflow whose internal engine for photo-z estimation uses the MLPQNA neural network (Multi Layer Perceptron with Quasi Newton learning rule), with the possibility of easily replacing the specific machine learning model chosen to predict photo-zs. After a short description of the software, we present a summary of results on public galaxy data (Sloan Digital Sky Survey - Data Release 9) and a comparison with a completely different method based on Spectral Energy Distribution (SED) template fitting.
We present METAPHOR (Machine-learning Estimation Tool for Accurate PHOtometric Redshifts), a method able to provide a reliable PDF for photometric galaxy redshifts estimated through empirical techniques. METAPHOR is a modular workflow, mainly based on the MLPQNA neural network as the internal engine to derive photometric galaxy redshifts, but it allows MLPQNA to be easily replaced with any other method to predict photo-zs and their PDFs. We present the results of a validation test of the workflow on galaxies from SDSS-DR9, and we also show the universality of the method by replacing MLPQNA with KNN and Random Forest models. The validation test also includes a comparison with the PDFs derived from a traditional SED template fitting method (Le Phare).
We estimated photometric redshifts (zphot) for more than 1.1 million galaxies of the ESO Public Kilo-Degree Survey (KiDS) Data Release 2. KiDS is an optical wide-field imaging survey carried out with the VLT Survey Telescope (VST) and the OmegaCAM camera, which aims at tackling open questions in cosmology and galaxy evolution, such as the origin of dark energy and the channel of galaxy mass growth. We present a catalogue of photometric redshifts obtained using the Multi Layer Perceptron with Quasi Newton Algorithm (MLPQNA) model, provided within the framework of the DAta Mining and Exploration Web Application REsource (DAMEWARE). These photometric redshifts are based on a spectroscopic knowledge base obtained by merging spectroscopic datasets from GAMA (Galaxy And Mass Assembly) data release 2 and SDSS-III data release 9. The overall 1 sigma uncertainty on Delta z = (zspec - zphot) / (1 + zspec) is ~0.03, with a very small average bias of ~0.001, an NMAD of ~0.02 and a fraction of catastrophic outliers (|Delta z| > 0.15) of ~0.4%.
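As a minimal sketch of how the statistics quoted in this abstract can be computed from matched spectroscopic and photometric redshifts, the snippet below evaluates Delta z = (zspec - zphot) / (1 + zspec) and the derived quantities. It rests on assumptions not stated in the abstract: the 1 sigma uncertainty is taken to be the standard deviation of Delta z, and NMAD is taken to be 1.4826 times the median absolute deviation of Delta z around its median (a common convention; some works use a slightly different form). The function name `photoz_metrics` and the toy input values are hypothetical and are not taken from the KiDS catalogue.

```python
from statistics import mean, median, pstdev


def photoz_metrics(z_spec, z_phot, outlier_threshold=0.15):
    """Summary statistics of normalized photo-z residuals.

    Assumed conventions (not given in the abstract):
      - sigma is the population standard deviation of Delta z
      - NMAD = 1.4826 * median(|Delta z - median(Delta z)|)
    """
    # Normalized residuals, Delta z = (zspec - zphot) / (1 + zspec)
    dz = [(zs - zp) / (1.0 + zs) for zs, zp in zip(z_spec, z_phot)]
    dz_median = median(dz)
    abs_dev = [abs(d - dz_median) for d in dz]
    n_outliers = sum(1 for d in dz if abs(d) > outlier_threshold)
    return {
        "bias": mean(dz),
        "sigma": pstdev(dz),
        "nmad": 1.4826 * median(abs_dev),
        "outlier_fraction": n_outliers / len(dz),
    }


# Toy example with made-up redshifts; the last object is a catastrophic outlier.
z_spec = [0.1, 0.2, 0.3, 0.4]
z_phot = [0.1, 0.2, 0.3, 1.0]
metrics = photoz_metrics(z_spec, z_phot)
print(metrics)
```

In the toy input a single catastrophic outlier shifts the mean bias and inflates sigma while leaving NMAD unchanged, which is why robust estimators such as NMAD are usually reported alongside the standard deviation.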
A variety of fundamental astrophysical science topics require the determination of very accurate photometric redshifts (photo-zs). A wide range of methods has been developed, based either on template model fitting or on empirical explorations of the photometric parameter space. Machine learning based techniques are not explicitly dependent on physical priors and are able to produce accurate photo-z estimates within the photometric ranges covered by the spectroscopic training set. These estimates, however, are not easy to characterize in terms of a photo-z Probability Density Function (PDF), because the analytical relation mapping the photometric parameters onto the redshift space is virtually unknown. We present METAPHOR (Machine-learning Estimation Tool for Accurate PHOtometric Redshifts), a method designed to provide a reliable PDF of the error distribution for empirical techniques. The method is implemented as a modular workflow whose internal engine for photo-z estimation uses the MLPQNA neural network (Multi Layer Perceptron with Quasi Newton learning rule), with the possibility of easily replacing the specific machine learning model chosen to predict photo-zs. We present a summary of results on SDSS-DR9 galaxy data, which are also used to perform a direct comparison with PDFs obtained by the Le Phare SED template fitting. We show that METAPHOR is capable of estimating the precision and reliability of photometric redshifts obtained with three different self-adaptive techniques, i.e. MLPQNA, Random Forest and the standard K-Nearest Neighbors models.