
Star-Galaxy Separation via Gaussian Processes with Model Reduction

Submitted by Imene Goumiri
Publication date: 2020
Research field: Physics
Paper language: English





Modern cosmological surveys such as the Hyper Suprime-Cam (HSC) survey produce a huge volume of low-resolution images of both distant galaxies and dim stars in our own galaxy. Being able to automatically classify these images is a long-standing problem in astronomy and critical to a number of different scientific analyses. Recently, the challenge of star-galaxy classification has been approached with Deep Neural Networks (DNNs), which are good at learning complex nonlinear embeddings. However, DNNs are known to overconfidently extrapolate on unseen data and require a large volume of training images that accurately capture the data distribution to be considered reliable. Gaussian Processes (GPs), which infer posterior distributions over functions and naturally quantify uncertainty, haven't been a tool of choice for this task, mainly because popular kernels exhibit limited expressivity on complex and high-dimensional data. In this paper, we present a novel approach to the star-galaxy separation problem that uses GPs and reaps their benefits while solving many of the issues that have traditionally limited them for classification of high-dimensional celestial image data. After an initial filtering of the raw star and galaxy image cutouts, we first reduce the dimensionality of the input images using Principal Component Analysis (PCA) and then apply GPs with a simple Radial Basis Function (RBF) kernel to the reduced data. Using this method, we greatly improve the accuracy of the classification over a basic application of GPs while improving the computational efficiency and scalability of the method.
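For illustration only, the following is a minimal sketch of the kind of PCA-plus-GP pipeline the abstract describes, written with scikit-learn. The file names, the number of retained components, and the kernel settings are assumptions for this example, not the paper's actual configuration.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.gaussian_process.kernels import RBF
from sklearn.model_selection import train_test_split

# Hypothetical inputs: image cutouts of shape (n_objects, height, width) and
# binary labels (0 = star, 1 = galaxy) produced by an initial filtering step.
cutouts = np.load("cutouts.npy")
labels = np.load("labels.npy")

X = cutouts.reshape(len(cutouts), -1)   # flatten each cutout into a feature vector
X_train, X_test, y_train, y_test = train_test_split(X, labels, test_size=0.2, random_state=0)

# Step 1: reduce dimensionality with PCA (20 components is an arbitrary choice here).
pca = PCA(n_components=20).fit(X_train)
Z_train, Z_test = pca.transform(X_train), pca.transform(X_test)

# Step 2: classify in the reduced space with a GP using a plain RBF kernel.
gp = GaussianProcessClassifier(kernel=RBF(length_scale=1.0))
gp.fit(Z_train, y_train)
print("test accuracy:", gp.score(Z_test, y_test))
```

In practice the number of PCA components and the RBF length scale would be chosen by cross-validation or by maximising the GP marginal likelihood.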




Read also

We discuss the statistical foundations of morphological star-galaxy separation. We show that many of the star-galaxy separation metrics in common use today (e.g. by SDSS or SExtractor) are closely related both to each other, and to the model odds ratio derived in a Bayesian framework by Sebok (1979). While the scaling of these algorithms with the noise properties of the sources varies, these differences do not strongly differentiate their performance. We construct a model of the performance of a star-galaxy separator in a realistic survey to understand the impact of observational signal-to-noise ratio (or equivalently, 5-sigma limiting depth) and seeing on classification performance. The model quantitatively demonstrates that, assuming realistic densities and angular sizes of stars and galaxies, 10% worse seeing can be compensated for by approximately 0.4 magnitudes deeper data to achieve the same star-galaxy classification performance. We discuss how to probabilistically combine multiple measurements, either of the same type (e.g., subsequent exposures), or differing types (e.g., multiple bandpasses), or differing methodologies (e.g., morphological and color-based classification). These methods are increasingly important for observations at faint magnitudes, where the rapidly rising number density of small galaxies makes star-galaxy classification a challenging problem. However, because of the significant role that the signal-to-noise ratio plays in resolving small galaxies, surveys with large-aperture telescopes, such as LSST, will continue to see improving star-galaxy separation as they push to these fainter magnitudes.
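As a toy illustration of the kind of probabilistic combination discussed above, the sketch below multiplies independent per-measurement likelihood ratios into a single posterior star/galaxy probability; the function name, the numbers, and the prior are invented for this example.

```python
import numpy as np

def combine_star_galaxy_odds(likelihood_ratios, prior_odds=1.0):
    """Posterior P(galaxy) from independent measurements.

    likelihood_ratios: per-measurement ratios P(data | galaxy) / P(data | star),
        e.g. from a morphological classifier, a colour-based classifier,
        or repeated exposures of the same source.
    prior_odds: prior P(galaxy) / P(star), set by the expected source densities.
    """
    posterior_odds = prior_odds * np.prod(likelihood_ratios)
    return posterior_odds / (1.0 + posterior_odds)

# Two measurements that mildly favour "galaxy", combined with a star-rich prior.
print(combine_star_galaxy_odds([2.5, 1.8], prior_odds=0.5))
```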
Context: It is crucial to develop a method for classifying objects detected in deep surveys at infrared wavelengths. We specifically need a method to separate galaxies from stars using only the infrared information to study the properties of galaxies, e.g., to estimate the angular correlation function, without introducing any additional bias. Aims: We aim to separate stars and galaxies in the data from the AKARI North Ecliptic Pole (NEP) Deep survey collected in nine AKARI/IRC bands from 2 to 24 $\mu$m that cover the near- and mid-infrared wavelengths (hereafter NIR and MIR). We plan to estimate the correlation function for NIR and MIR galaxies from a sample selected according to our criteria in future research. Methods: We used support vector machines (SVM) to study the distribution of stars and galaxies in AKARI's multicolor space. We defined the training samples of these objects by calculating their infrared stellarity parameter (sgc). We created the most efficient classifier and then tested it on the whole sample. We confirmed the developed separation with auxiliary optical data obtained by the Subaru telescope and by creating Euclidean normalized number count plots. Results: We obtain a 90% accuracy in pinpointing galaxies and 98% accuracy for stars in infrared multicolor space with the infrared SVM classifier. The source counts and comparison with the optical data (with a consistency of 65% for selecting stars and 96% for galaxies) confirm that our star/galaxy separation methods are reliable. Conclusions: The infrared classifier derived with the SVM method based on infrared sgc-selected training samples proves to be very efficient and accurate in selecting stars and galaxies in deep surveys at infrared wavelengths carried out without any previous target object selection.
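A minimal sketch of an SVM star/galaxy classifier trained on infrared photometry, in the spirit of the method described above; the input arrays, file names, and hyperparameters are placeholders, not the AKARI/IRC catalogue or the paper's tuned classifier.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

# Hypothetical inputs: per-source magnitudes in several infrared bands, and
# training labels (1 = galaxy, 0 = star) derived from a stellarity-like parameter.
features = np.load("irc_magnitudes.npy")
labels = np.load("sgc_labels.npy")

# Standardise the colours/magnitudes, then fit an RBF-kernel SVM.
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
scores = cross_val_score(clf, features, labels, cv=5)
print("cross-validated accuracy:", scores.mean())
```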
Strong-lensing images provide a wealth of information both about the magnified source and about the dark matter distribution in the lens. Precision analyses of these images can be used to constrain the nature of dark matter. However, this requires high-fidelity image reconstructions and careful treatment of the uncertainties of both lens mass distribution and source light, which are typically difficult to quantify. In anticipation of future high-resolution datasets, in this work we leverage a range of recent developments in machine learning to develop a new Bayesian strong-lensing image analysis pipeline. Its highlights are: (A) a fast, GPU-enabled, end-to-end differentiable strong-lensing image simulator; (B) a new, statistically principled source model based on a computationally highly efficient approximation to Gaussian processes that also takes into account pixellation; and (C) a scalable variational inference framework that enables simultaneously deriving posteriors for tens of thousands of lens and source parameters and optimising hyperparameters via stochastic gradient descent. Besides efficient and accurate parameter estimation and lens model uncertainty quantification, the main aim of the pipeline is the generation of training data for targeted simulation-based inference of dark matter substructure, which we will exploit in a companion paper.
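The variational-inference component (C) can be illustrated in miniature: the sketch below fits a one-parameter Gaussian variational posterior to a toy linear model by stochastic gradient ascent with the reparameterisation trick. It is only a schematic analogue of the pipeline described above; the data, prior, and learning rate are invented.

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(0.0, 1.0, 50)
y_obs = 2.0 * x + 0.1 * rng.standard_normal(x.size)   # synthetic data, true slope = 2
noise_sd, prior_sd = 0.1, 10.0

def dlogjoint_dtheta(theta):
    # Gradient of log[ N(y_obs | theta*x, noise_sd^2) * N(theta | 0, prior_sd^2) ]
    return np.sum(x * (y_obs - theta * x)) / noise_sd**2 - theta / prior_sd**2

mu, log_sd, lr = 0.0, 0.0, 5e-4            # variational family q(theta) = N(mu, sd^2)
for _ in range(5000):
    sd = np.exp(log_sd)
    eps = rng.standard_normal()
    theta = mu + sd * eps                  # reparameterisation trick
    g = dlogjoint_dtheta(theta)
    mu += lr * g                           # one-sample stochastic ELBO gradient
    log_sd += lr * (g * eps * sd + 1.0)    # "+1" is d(entropy of q)/d(log_sd)

print("posterior mean ~", mu, " posterior sd ~", np.exp(log_sd))
```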
Stochastic field distortions caused by atmospheric turbulence are a fundamental limitation to the astrometric accuracy of ground-based imaging. This distortion field is measurable at the locations of stars with accurate positions provided by the Gaia DR2 catalog; we develop the use of Gaussian process regression (GPR) to interpolate the distortion field to arbitrary locations in each exposure. We introduce an extension to standard GPR techniques that exploits the knowledge that the 2-dimensional distortion field is curl-free. Applied to several hundred 90-second exposures from the Dark Energy Survey as a testbed, we find that the GPR correction reduces the variance of the turbulent distortions $\approx 12\times$, on average, with better performance in denser regions of the Gaia catalog. The RMS per-coordinate distortion in the $riz$ bands is typically $\approx 7$ mas before any correction, and $\approx 2$ mas after application of the GPR model. The GPR astrometric corrections are validated by the observation that their use reduces, from 10 to 5 mas RMS, the residuals to an orbit fit to $riz$-band observations over 5 years of the $r=18.5$ trans-Neptunian object Eris. We also propose a GPR method, not yet implemented, for simultaneously estimating the turbulence fields and the 5-dimensional stellar solutions in a stack of overlapping exposures, which should yield further turbulence reductions in future deep surveys.
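The sketch below shows the basic idea of interpolating a two-component distortion field with Gaussian process regression, fit at reference-star positions and evaluated elsewhere. It uses an ordinary per-component RBF kernel rather than the curl-free kernel the abstract introduces, and the field, star positions, and noise level are simulated for illustration.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)

# Reference-star (Gaia-like) positions in a 1x1 degree field, and a smooth fake
# distortion field sampled at those positions (arcsec), plus measurement noise.
xy_ref = rng.uniform(0.0, 1.0, size=(300, 2))
field = lambda xy: 0.005 * np.column_stack([np.sin(3 * xy[:, 0]), np.cos(3 * xy[:, 1])])
dxy_ref = field(xy_ref) + 1e-4 * rng.standard_normal((300, 2))

# Fit an independent-per-component GP to the measured distortions
# (alpha is the assumed measurement noise variance on the diagonal).
gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.2), alpha=1e-8)
gp.fit(xy_ref, dxy_ref)

# Interpolate the correction at the positions of science objects.
xy_target = rng.uniform(0.0, 1.0, size=(5, 2))
print(gp.predict(xy_target) - field(xy_target))   # residual error of the interpolation
```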
This research note presents a derivation and implementation of efficient and scalable gradient computations using the celerite algorithm for Gaussian Process (GP) modeling. The algorithms are derived in a reverse accumulation or backpropagation framework and they can be easily integrated into existing automatic differentiation frameworks to provide a scalable method for evaluating the gradients of the GP likelihood with respect to all input parameters. The algorithm derived in this note uses less memory and is more efficient th
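For context, the sketch below evaluates a GP negative log-likelihood and its parameter gradient with the standard dense-matrix identities; the celerite algorithm referenced above obtains such gradients in scalable form via reverse accumulation, which this toy example does not attempt. Kernel, data, and parameter values are invented.

```python
import numpy as np

def gp_nll_and_grad(t, y, amp, ell, jitter):
    """Negative log-likelihood (up to a constant) of a squared-exponential GP
    and its gradient with respect to (amp, ell), via dense-matrix identities."""
    d2 = (t[:, None] - t[None, :]) ** 2
    E = np.exp(-0.5 * d2 / ell**2)
    K = amp**2 * E + jitter**2 * np.eye(t.size)
    Kinv = np.linalg.inv(K)
    alpha = Kinv @ y
    nll = 0.5 * y @ alpha + 0.5 * np.linalg.slogdet(K)[1]
    M = Kinv - np.outer(alpha, alpha)     # dNLL/dtheta = 0.5 * tr(M dK/dtheta)
    dK_damp = 2.0 * amp * E
    dK_dell = amp**2 * E * d2 / ell**3
    grad = 0.5 * np.array([np.sum(M * dK_damp), np.sum(M * dK_dell)])
    return nll, grad

t = np.linspace(0.0, 10.0, 100)
rng = np.random.default_rng(0)
y = np.sin(t) + 0.1 * rng.standard_normal(t.size)
print(gp_nll_and_grad(t, y, amp=1.0, ell=1.0, jitter=0.1))
```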
