
Radio Galaxy Zoo: Knowledge Transfer Using Rotationally Invariant Self-Organising Maps

Published by Tim Galvin
Publication date: 2019
Research field: Physics
Language: English





With the advent of large-scale surveys, the manual analysis and classification of individual radio source morphologies is rendered impossible, as existing approaches do not scale. The analysis of complex morphological features in the spatial domain is a particularly important task. Here we discuss the challenges of transferring crowdsourced labels obtained from the Radio Galaxy Zoo project and introduce a proper transfer mechanism via quantile random forest regression. By using parallelized rotation- and flipping-invariant Kohonen maps, image cubes of Radio Galaxy Zoo selected galaxies formed from the FIRST radio continuum and WISE infrared all-sky surveys are first projected down to a two-dimensional embedding in an unsupervised way. This embedding can be seen as a discretised space of shapes, with the coordinates reflecting morphological features as expressed by the automatically derived prototypes. We find that these prototypes have reconstructed physically meaningful processes across two-channel images at radio and infrared wavelengths in an unsupervised manner. In the second step, images are compared with those prototypes to create a heat-map, which is the morphological fingerprint of each object and the basis for transferring the user-generated labels. These heat-maps have reduced the feature space by a factor of 248 and can be used as the basis for subsequent machine learning methods. Using an ensemble of decision trees, we achieve upwards of 85.7% and 80.7% accuracy when predicting the number of components and peaks in an image, respectively, using these heat-maps. We also question the currently used discrete classification schema and introduce a continuous scale that better reflects the uncertainty in the transition between two classes, caused by sensitivity and resolution limits.
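The heat-map step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes single-channel square images, an already trained grid of prototypes, and only the eight 90-degree rotations and flips, whereas the abstract does not state how finely rotations are sampled. The function names `transformed_copies`, `invariant_distance` and `heat_map` are hypothetical.

```python
import numpy as np

def transformed_copies(image):
    """Return the 8 copies of a square image under 90-degree rotations and flips.

    Assumption: only quarter-turn rotations are used in this sketch.
    """
    copies = []
    for flipped in (image, np.fliplr(image)):
        for k in range(4):
            copies.append(np.rot90(flipped, k))
    return copies

def invariant_distance(image, prototype):
    """Smallest Euclidean distance between the prototype and any transformed copy."""
    return min(np.linalg.norm(copy - prototype) for copy in transformed_copies(image))

def heat_map(image, prototypes):
    """Distance from one image to every prototype on a (rows, cols) grid.

    `prototypes` has shape (rows, cols, height, width); the result has shape
    (rows, cols) and plays the role of the image's morphological fingerprint.
    """
    rows, cols = prototypes.shape[:2]
    result = np.empty((rows, cols))
    for r in range(rows):
        for c in range(cols):
            result[r, c] = invariant_distance(image, prototypes[r, c])
    return result
```

The flattened heat-map (one value per prototype) would then serve as the feature vector for a downstream regressor or classifier, in place of the raw pixels.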




Read also

Accurate photometric redshift calibration is central to the robustness of all cosmology constraints from cosmic shear surveys. Analyses of the KiDS re-weighted training samples from all overlapping spectroscopic surveys to provide a direct redshift calibration. Using self-organising maps (SOMs) we demonstrate that this spectroscopic compilation is sufficiently complete for KiDS, representing $99\%$ of the effective 2D cosmic shear sample. We use the SOM to define a $100\%$ represented `gold' cosmic shear sample, per tomographic bin. Using mock simulations of KiDS and the spectroscopic training set, we estimate the uncertainty on the SOM redshift calibration, and find that photometric noise, sample variance, and spectroscopic selection effects (including redshift and magnitude incompleteness) induce a combined maximal scatter on the bias of the redshift distribution reconstruction ($\Delta \langle z \rangle=\langle z \rangle_{\rm est}-\langle z \rangle_{\rm true}$) of $\sigma_{\Delta \langle z \rangle} \leq 0.006$ in all tomographic bins. We show that the SOM calibration is unbiased in the cases of noiseless photometry and perfectly representative spectroscopic datasets, as expected from theory. The inclusion of both photometric noise and spectroscopic selection effects in our mock data introduces a maximal bias of $\Delta \langle z \rangle =0.013\pm0.006$, or $\Delta \langle z \rangle \leq 0.025$ at $97.5\%$ confidence, once quality flags have been applied to the SOM. The method presented here represents a significant improvement over the previously adopted direct redshift calibration implementation for KiDS, owing to its diagnostic and quality assurance capabilities. The implementation of this method in future cosmic shear studies will allow better diagnosis, examination, and mitigation of systematic biases in photometric redshift calibration.
Some argue that biologically inspired algorithms are the future of solving difficult problems in computer science. Others strongly believe that the future lies in the exploration of mathematical foundations of problems at hand. The field of computer security tends to accept the latter view as a more appropriate approach due to its more workable validation and verification possibilities. The lack of rigorous scientific practices prevalent in biologically inspired security research does not aid in presenting bio-inspired security approaches as a viable way of dealing with complex security problems. This chapter introduces a biologically inspired algorithm, called the Self Organising Map (SOM), that was developed by Teuvo Kohonen in 1981. Since the algorithm's inception it has been scrutinised by the scientific community and analysed in more than 4000 research papers, many of which dealt with various computer security issues, from anomaly detection and analysis of executables all the way to wireless network monitoring. In this chapter a review of security-related SOM research undertaken in the past is presented and analysed. The algorithm's biological analogies are detailed and the authors' view on the future possibilities of this successful bio-inspired approach is given. The SOM algorithm's close relation to a number of vital functions of the human brain and the emergence of multi-core computer architectures are the two main reasons behind our assumption that the future of the SOM algorithm and its variations is promising, notably in the field of computer security.
We present a new method for the mitigation of observational systematic effects in angular galaxy clustering via corrective random galaxy catalogues. Real and synthetic galaxy data, from the Kilo Degree Survey's (KiDS) 4$^{\rm{th}}$ Data Release (KiDS-$1000$) and the Full-sky Lognormal Astro-fields Simulation Kit (FLASK) package respectively, are used to train self-organising maps (SOMs) to learn the multivariate relationships between observed galaxy number density and up to six systematic-tracer variables, including seeing, Galactic dust extinction, and Galactic stellar density. We then create `organised randoms', i.e. random galaxy catalogues with spatially variable number densities, mimicking the learnt systematic density modes in the data. Using realistically biased mock data, we show that these organised randoms consistently subtract spurious density modes from the two-point angular correlation function $w(\vartheta)$, correcting biases of up to $12\sigma$ in the mean clustering amplitude to as low as $0.1\sigma$, over a high signal-to-noise angular range of 7-100 arcmin. Their performance is also validated for angular clustering cross-correlations in a bright, flux-limited subset of KiDS-$1000$, comparing against an analogous sample constructed from highly-complete spectroscopic redshift data. Each organised random catalogue object is a `clone' carrying the properties of a real galaxy, and is distributed throughout the survey footprint according to the parent galaxy's position in systematics-space. Thus, sub-sample randoms are readily derived from a single master random catalogue via the same selection as applied to the real galaxies. Our method is expected to improve in performance with increased survey area, galaxy number density, and systematic contamination, making organised randoms extremely promising for current and future clustering analyses of faint samples.
We consider the problem of determining the host galaxies of radio sources by cross-identification. This has traditionally been done manually, which will be intractable for wide-area radio surveys like the Evolutionary Map of the Universe (EMU). Automated cross-identification will be critical for these future surveys, and machine learning may provide the tools to develop such methods. We apply a standard approach from computer vision to cross-identification, introducing one possible way of automating this problem, and explore the pros and cons of this approach. We apply our method to the 1.4 GHz Australian Telescope Large Area Survey (ATLAS) observations of the Chandra Deep Field South (CDFS) and the ESO Large Area ISO Survey South 1 (ELAIS-S1) fields by cross-identifying them with the Spitzer Wide-area Infrared Extragalactic (SWIRE) survey. We train our method with two sets of data: expert cross-identifications of CDFS from the initial ATLAS data release and crowdsourced cross-identifications of CDFS from Radio Galaxy Zoo. We found that a simple strategy of cross-identifying a radio component with the nearest galaxy performs comparably to our more complex methods, though our estimated best-case performance is near 100 per cent. ATLAS contains 87 complex radio sources that have been cross-identified by experts, so there are not enough complex examples to learn how to cross-identify them accurately. Much larger datasets are therefore required for training methods like ours. We also show that training our method on Radio Galaxy Zoo cross-identifications gives comparable results to training on expert cross-identifications, demonstrating the value of crowdsourced training data.
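The "nearest galaxy" baseline mentioned in this abstract is simple enough to sketch. The version below is an illustration only: it assumes positions are already expressed as flat Cartesian offsets over a small field (a real catalogue would need angular separations on the sky), and the function name `nearest_galaxy` is hypothetical.

```python
import numpy as np

def nearest_galaxy(radio_xy, galaxy_xy):
    """For each radio component, return the index of the closest candidate galaxy.

    Assumption: both inputs are (n, 2) arrays of flat-sky coordinates in the
    same units, so plain Euclidean distance is a fair stand-in for separation.
    """
    radio_xy = np.asarray(radio_xy, dtype=float)
    galaxy_xy = np.asarray(galaxy_xy, dtype=float)
    # Pairwise offsets with shape (n_radio, n_galaxy, 2).
    offsets = radio_xy[:, None, :] - galaxy_xy[None, :, :]
    distances = np.linalg.norm(offsets, axis=2)
    return distances.argmin(axis=1)
```

A baseline like this needs no training data, which makes it a natural reference point when only a few dozen complex sources have expert labels.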
We apply the technique of self-organising maps (Kohonen 1990) to the automated classification of singly periodic astronomical lightcurves. We find that our maps readily distinguish between lightcurve types in both synthetic and real datasets, and that the resulting maps do not depend sensitively on the chosen learning parameters. Automated data analysis techniques are likely to become increasingly important as the size of astronomical datasets continues to increase, particularly with the advent of ultra-wide-field survey telescopes such as WASP, RAPTOR and ASAS.