Outlier Prediction and Training Set Modification to Reduce Catastrophic Outlier Redshift Estimates in Large-Scale Surveys

Published by Jack Singal in 2019 in Physics. Language: English.

Abstract in English

We present results of using individual galaxies' probability distributions over redshift as a method of identifying potential catastrophic outliers in empirical photometric redshift estimation. In the course of developing this approach, we devise a method for modifying the redshift distribution of training sets to improve both the baseline accuracy of high-redshift (z>1.5) estimation and catastrophic outlier mitigation. We demonstrate these methods using two real test data sets and one simulated test data set spanning a wide redshift range (0<z<4). The results presented here inform an example "prescription" that can be applied as a realistic photometric redshift estimation scenario for a hypothetical large-scale survey. We find that, with appropriate optimization, we can identify a significant percentage (>30%) of catastrophic outlier galaxies while incorrectly flagging only a small percentage (<7%, and in many cases <3%) of non-outlier galaxies as catastrophic outliers. We also find that our training set redshift distribution modification results in a significant (>10) percentage point decrease in outlier galaxies for z>1.5, with only a small (<3) percentage point increase in outlier galaxies for z<1.5, compared to the unmodified training set. In addition, we find that this modification can in some cases cause a significant (~20) percentage point decrease in the number of galaxies that are non-outliers but have been incorrectly identified as outliers, while in other cases it causes only a small (<1) percentage increase in this metric.
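As a rough illustration of the idea of flagging potential outliers from each galaxy's redshift probability distribution, the following Python sketch marks a galaxy as a potential catastrophic outlier when a large share of its p(z) lies far from the peak, and then computes the two rates quoted in the abstract (fraction of true outliers caught, fraction of non-outliers wrongly flagged). The abstract does not state the actual flagging criterion or outlier definition, so everything here is an assumption made for illustration: the function names, the `window` and `mass_threshold` parameters, and the `outlier_cut` of |z_phot - z_true| > 1.

```python
import numpy as np

def flag_potential_outliers(z_grid, pdfs, window=0.5, mass_threshold=0.3):
    """Flag galaxies whose p(z) puts much of its probability far from the peak.

    Hypothetical criterion (not taken from the paper): a galaxy is flagged
    when more than `mass_threshold` of its probability lies more than
    `window` in redshift away from the peak of its p(z).
    """
    z_grid = np.asarray(z_grid, dtype=float)
    pdfs = np.asarray(pdfs, dtype=float)
    # Normalise each row so it sums to 1 (p(z) treated as discrete on the grid).
    probs = pdfs / pdfs.sum(axis=1, keepdims=True)
    # Point estimate: the redshift at which each p(z) peaks.
    z_peak = z_grid[np.argmax(probs, axis=1)]
    # Boolean mask of grid points lying farther than `window` from each peak.
    far = np.abs(z_grid[None, :] - z_peak[:, None]) > window
    mass_far = (probs * far).sum(axis=1)
    return z_peak, mass_far > mass_threshold

def flagging_rates(z_phot, z_true, flagged, outlier_cut=1.0):
    """Return (fraction of true outliers flagged, fraction of non-outliers flagged).

    `outlier_cut` is an assumed definition of a catastrophic outlier.
    """
    z_phot = np.asarray(z_phot, dtype=float)
    z_true = np.asarray(z_true, dtype=float)
    flagged = np.asarray(flagged, dtype=bool)
    is_outlier = np.abs(z_phot - z_true) > outlier_cut
    caught = flagged[is_outlier].mean() if is_outlier.any() else float("nan")
    false_flag = flagged[~is_outlier].mean() if (~is_outlier).any() else float("nan")
    return caught, false_flag
```

In a sketch like this, `window` and `mass_threshold` are the knobs one would tune to trade off the fraction of outliers caught against the fraction of non-outliers incorrectly flagged, which is the kind of optimization the abstract refers to.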
