
Outlier Prediction and Training Set Modification to Reduce Catastrophic Outlier Redshift Estimates in Large-Scale Surveys

Published by Jack Singal
Publication date: 2019
Research field: Physics
Paper language: English





We present results of using individual galaxies' probability distributions over redshift to identify potential catastrophic outliers in empirical photometric redshift estimation. In the course of developing this approach we develop a method of modifying the redshift distribution of training sets that improves both the baseline accuracy of high-redshift (z>1.5) estimation and catastrophic outlier mitigation. We demonstrate these using two real test data sets and one simulated test data set spanning a wide redshift range (0<z<4). Results presented here inform an example 'prescription' that can be applied as a realistic photometric redshift estimation scenario for a hypothetical large-scale survey. We find that, with appropriate optimization, we can identify a significant percentage (>30%) of catastrophic outlier galaxies while incorrectly flagging only a small percentage (<7%, and in many cases <3%) of non-outlier galaxies as catastrophic outliers. We also find that our training set redshift distribution modification results in a significant (>10) percentage point decrease in outlier galaxies for z>1.5, with only a small (<3) percentage point increase in outlier galaxies for z<1.5, compared to the unmodified training set. In addition, we find that this modification can in some cases cause a significant (~20) percentage point decrease in galaxies that are non-outliers but have been incorrectly identified as outliers, while in other cases it causes only a small (<1) percentage increase in this metric.
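The two flagging metrics quoted above (the percentage of catastrophic outliers that are flagged, and the percentage of non-outliers that are incorrectly flagged) can be computed as in the minimal sketch below. The abstract does not state the outlier criterion, so the threshold |z_phot - z_true| > 1.0 is an assumption made here for illustration, and the function name `flagging_metrics` and its parameters are hypothetical.

```python
import numpy as np

def flagging_metrics(z_true, z_phot, flagged, outlier_threshold=1.0):
    """Return (% of outliers flagged, % of non-outliers flagged).

    ASSUMPTION: a galaxy counts as a catastrophic outlier when
    |z_phot - z_true| > outlier_threshold. The threshold value is
    illustrative and is not taken from the abstract.
    """
    z_true = np.asarray(z_true, dtype=float)
    z_phot = np.asarray(z_phot, dtype=float)
    flagged = np.asarray(flagged, dtype=bool)

    is_outlier = np.abs(z_phot - z_true) > outlier_threshold
    n_outliers = is_outlier.sum()
    n_non_outliers = (~is_outlier).sum()

    # Guard against empty groups so the division is always defined.
    pct_outliers_flagged = (
        100.0 * (flagged & is_outlier).sum() / n_outliers
        if n_outliers else float("nan")
    )
    pct_non_outliers_flagged = (
        100.0 * (flagged & ~is_outlier).sum() / n_non_outliers
        if n_non_outliers else float("nan")
    )
    return pct_outliers_flagged, pct_non_outliers_flagged

# Toy data (made up for the example, not from the paper).
z_true = [0.3, 0.5, 2.8, 3.1, 0.9, 1.2]
z_phot = [0.35, 0.45, 0.4, 3.0, 0.95, 2.6]
flagged = [False, True, True, False, False, False]
metrics = flagging_metrics(z_true, z_phot, flagged)
```

A good flagging scheme pushes the first number up while keeping the second low. Comparing either number between a modified and an unmodified training set gives the percentage point changes of the kind the abstract reports.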


Read also

78 - Evan Jones, J. Singal 2017
We present results of using individual galaxies' redshift probability information derived from a photometric redshift (photo-z) algorithm, SPIDERz, to identify potential catastrophic outliers in photometric redshift determinations. By using two test data sets comprised of COSMOS multi-band photometry spanning a wide redshift range (0<z<4) matched with reliable spectroscopic or other redshift determinations, we explore the efficacy of a novel method to flag potential catastrophic outliers in an analysis which relies on accurate photometric redshifts. SPIDERz is a custom support vector machine classification algorithm for photo-z analysis that naturally outputs a distribution of redshift probability information for each galaxy in addition to a discrete most probable photo-z value. By applying an analytic technique with flagging criteria to identify the presence of probability distribution features characteristic of catastrophic outlier photo-z estimates, such as multiple redshift probability peaks separated by substantial redshift distances, we can flag potential catastrophic outliers in photo-z determinations. We find that our proposed method can correctly flag large fractions (>50%) of the catastrophic outlier galaxies while flagging only a small fraction (<5%) of the total non-outlier galaxies, depending on parameter choices. The fraction of non-outlier galaxies flagged varies significantly with redshift and magnitude, however. We examine the performance of this strategy in photo-z determinations using a range of flagging parameter values. These results could be useful for the application of photometric redshifts in future large-scale surveys where catastrophic outliers are particularly detrimental to the science goals.
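The flagging criterion described above, multiple redshift probability peaks separated by a substantial redshift distance, can be illustrated with the rough sketch below. It is not the SPIDERz implementation: the function name `flag_multi_peak`, the peak definition, and the parameters `min_peak_prob` and `min_separation` (with their default values) are all assumptions chosen for the example.

```python
import numpy as np

def flag_multi_peak(z_grid, prob, min_peak_prob=0.1, min_separation=1.0):
    """Flag a galaxy whose binned redshift probability has well-separated peaks.

    ASSUMPTIONS (illustrative, not from the paper): a peak is a bin whose
    normalized probability is greater than its left neighbour, at least as
    large as its right neighbour, and at least `min_peak_prob`. The galaxy
    is flagged when the lowest and highest peak redshifts differ by at
    least `min_separation`.
    """
    z_grid = np.asarray(z_grid, dtype=float)
    prob = np.asarray(prob, dtype=float)
    prob = prob / prob.sum()  # normalize so the bins sum to 1

    # Pad with -inf so that the first and last bins can also be peaks.
    padded = np.concatenate(([-np.inf], prob, [-np.inf]))
    is_peak = (
        (padded[1:-1] > padded[:-2])
        & (padded[1:-1] >= padded[2:])
        & (prob >= min_peak_prob)
    )
    peak_z = z_grid[is_peak]
    if peak_z.size < 2:
        return False
    return bool(peak_z.max() - peak_z.min() >= min_separation)

# Toy distributions on a coarse redshift grid (values made up for the example).
z_grid = np.linspace(0.0, 4.0, 9)  # 0.0, 0.5, ..., 4.0
bimodal = [0.0, 0.4, 0.1, 0.0, 0.0, 0.0, 0.1, 0.35, 0.05]
unimodal = [0.0, 0.1, 0.6, 0.3, 0.0, 0.0, 0.0, 0.0, 0.0]
flag_bimodal = flag_multi_peak(z_grid, bimodal)
flag_unimodal = flag_multi_peak(z_grid, unimodal)
```

Raising `min_peak_prob` or `min_separation` flags fewer galaxies, which is one way to picture the trade-off between outliers caught and non-outliers flagged that depends on parameter choices.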
70 - Adriano Agnello 2017
I describe two novel techniques originally devised to select strongly lensed quasar candidates in wide-field surveys. The first relies on outlier selection in optical and mid-infrared magnitude space; the second combines mid-infrared colour selection with GAIA spatial resolution, to identify multiplets of objects with quasar-like colours. Both methods have already been applied successfully to the SDSS, ATLAS and DES footprints: besides recovering known lenses from previous searches, they have led to new discoveries, including quadruply lensed quasars, which are rare within the rare-object class of quasar lenses. As a serendipitous by-product, at least four candidate Galactic streams in the South have been identified among foreground contaminants. There is considerable scope for tailoring the WISE-GAIA multiplet search to stellar-like objects, instead of quasar-like, and to automatically detect Galactic streams.
This work addresses the outlier removal problem in large-scale global structure-from-motion. In such applications, global outlier removal is very useful to mitigate the deterioration caused by mismatches in the feature point matching step. Unlike existing outlier removal methods, we exploit the structure in multiview geometry problems to propose a dimension-reduced formulation, based on which two methods have been developed. The first method considers a convex relaxed $\ell_1$ minimization and is solved by a single linear program (LP), whilst the second one approximately solves the ideal $\ell_0$ minimization by an iteratively reweighted method. The dimension reduction results in a significant speedup of the new algorithms. Further, the iteratively reweighted method can significantly reduce the possibility of removing true inliers. Realistic multiview reconstruction experiments demonstrated that, compared with state-of-the-art algorithms, the new algorithms are much more efficient and can also give improved solutions. Matlab code for reproducing the results is available at https://github.com/FWen/OUTLR.git.
The proliferation of Web services makes it difficult for users to select the most appropriate one among numerous functionally identical or similar service candidates. Quality-of-Service (QoS) describes the non-functional characteristics of Web services, and it has become the key differentiator for service selection. However, users cannot invoke all Web services to obtain the corresponding QoS values due to high time cost and huge resource overhead. Thus, it is essential to predict unknown QoS values. Although various QoS prediction methods have been proposed, few of them have taken outliers into consideration, which may dramatically degrade the prediction performance. To overcome this limitation, we propose an outlier-resilient QoS prediction method in this paper. Our method utilizes Cauchy loss to measure the discrepancy between the observed QoS values and the predicted ones. Owing to the robustness of Cauchy loss, our method is resilient to outliers. We further extend our method to provide time-aware QoS prediction results by taking the temporal information into consideration. Finally, we conduct extensive experiments on both static and dynamic datasets. The results demonstrate that our method is able to achieve better performance than state-of-the-art baseline methods.
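To illustrate why a Cauchy loss is robust to outliers, the sketch below compares one common form of it, ln(1 + (r/γ)²), with the squared error. The exact parameterization used in the paper is not given in the abstract, so this form, the scale parameter `gamma`, and the function names are assumptions for illustration only.

```python
import numpy as np

def cauchy_loss(residual, gamma=1.0):
    """Cauchy loss in one common form: ln(1 + (r / gamma)^2).

    ASSUMPTION: this parameterization and the default gamma are
    illustrative; the abstract does not specify them.
    """
    r = np.asarray(residual, dtype=float)
    return np.log1p((r / gamma) ** 2)

def squared_loss(residual):
    """Ordinary squared error, shown for comparison."""
    r = np.asarray(residual, dtype=float)
    return r ** 2

# Residuals between observed and predicted values; the last one is an outlier.
residuals = np.array([0.1, -0.2, 0.05, 50.0])
cauchy_values = cauchy_loss(residuals)
squared_values = squared_loss(residuals)

# Share of the total loss contributed by the single outlier under each loss.
outlier_share_cauchy = cauchy_values[-1] / cauchy_values.sum()
outlier_share_squared = squared_values[-1] / squared_values.sum()
```

Because the Cauchy loss grows only logarithmically for large residuals, a single large error contributes a smaller share of the total than it does under the squared loss, so it pulls the fit less.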
RF devices can be identified by unique imperfections embedded in the signals they transmit, called RF fingerprints. The closed set classification of such devices, where the identification must be made among an authorized set of transmitters, has been well explored. However, the much more difficult problem of open set classification, where the classifier needs to reject unauthorized transmitters while recognizing authorized transmitters, has only recently been addressed. So far, efforts at open set classification have largely relied on signal samples captured from a known set of unauthorized transmitters to help the classifier learn unauthorized transmitter fingerprints. Since acquiring new transmitters to use as known transmitters is highly expensive, we propose to use generative deep learning methods to emulate unauthorized signal samples for the augmentation of training datasets. We develop two different data augmentation techniques, one that exploits a limited number of known unauthorized transmitters and the other that does not require any unauthorized transmitters. Experiments conducted on a dataset captured from a WiFi testbed indicate that data augmentation allows for significant increases in open set classification accuracy, especially when the authorized set is small.