Accurate photometric redshift calibration is central to the robustness of all cosmology constraints from cosmic shear surveys. Analyses of KiDS have re-weighted training samples from all overlapping spectroscopic surveys to provide a direct redshift calibration. Using self-organising maps (SOMs), we demonstrate that this spectroscopic compilation is sufficiently complete for KiDS, representing $99\%$ of the effective 2D cosmic shear sample. We use the SOM to define a $100\%$ represented `gold' cosmic shear sample, per tomographic bin. Using mock simulations of KiDS and the spectroscopic training set, we estimate the uncertainty on the SOM redshift calibration, and find that photometric noise, sample variance, and spectroscopic selection effects (including redshift and magnitude incompleteness) induce a combined maximal scatter on the bias of the redshift distribution reconstruction ($\Delta\langle z \rangle = \langle z \rangle_{\rm est} - \langle z \rangle_{\rm true}$) of $\sigma_{\Delta\langle z \rangle} \leq 0.006$ in all tomographic bins. We show that the SOM calibration is unbiased in the cases of noiseless photometry and perfectly representative spectroscopic datasets, as expected from theory. The inclusion of both photometric noise and spectroscopic selection effects in our mock data introduces a maximal bias of $\Delta\langle z \rangle = 0.013 \pm 0.006$, or $\Delta\langle z \rangle \leq 0.025$ at $97.5\%$ confidence, once quality flags have been applied to the SOM. The method presented here represents a significant improvement over the previously adopted direct redshift calibration implementation for KiDS, owing to its diagnostic and quality-assurance capabilities. The implementation of this method in future cosmic shear studies will allow better diagnosis, examination, and mitigation of systematic biases in photometric redshift calibration.
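To make the calibration step concrete, the following is a minimal sketch of SOM-based direct calibration under simplifying assumptions: a SOM is trained on the photometric sample, spectroscopic galaxies are re-weighted by the per-cell ratio of photometric to spectroscopic counts, and the `gold' sample is restricted to cells containing at least one spectroscopic galaxy. All array names, SOM dimensions, and the use of the \texttt{minisom} package are illustrative stand-ins, not the authors' pipeline.

\begin{verbatim}
# Minimal sketch of SOM-based direct redshift calibration (illustrative
# names and dimensions; not the authors' pipeline).
# Requires: pip install numpy minisom
import numpy as np
from minisom import MiniSom

rng = np.random.default_rng(42)
n_feat = 9                                      # magnitudes/colours per galaxy
phot = rng.normal(size=(50_000, n_feat))        # placeholder photometric sample
idx = rng.choice(50_000, 5_000, replace=False)
spec = phot[idx]                                # placeholder spectroscopic subset
spec_z = rng.uniform(0.1, 1.2, size=len(spec))  # placeholder redshifts

# Train the SOM on the full photometric sample.
som = MiniSom(30, 30, n_feat, sigma=2.0, learning_rate=0.5, random_seed=1)
som.train_random(phot, 10_000)

def cells(data, shape=(30, 30)):
    """Flattened best-matching-unit index for each object."""
    return np.array([np.ravel_multi_index(som.winner(x), shape) for x in data])

phot_c, spec_c = cells(phot), cells(spec)
n_phot = np.bincount(phot_c, minlength=900)
n_spec = np.bincount(spec_c, minlength=900)

# 'Gold' sample: photometric galaxies in cells represented by >=1 spec-z galaxy.
gold = n_spec[phot_c] > 0

# Re-weight spectroscopic galaxies so per-cell counts match the gold sample;
# each spec galaxy's own cell has n_spec >= 1, so the ratio is well defined.
w = n_phot[spec_c] / n_spec[spec_c]

# Calibrated redshift distribution and mean redshift of the gold sample.
nz, edges = np.histogram(spec_z, bins=40, range=(0.0, 1.5), weights=w, density=True)
print(f"gold fraction = {gold.mean():.3f}, <z> = {np.average(spec_z, weights=w):.3f}")
\end{verbatim}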
Some argue that biologically inspired algorithms are the future of solving difficult problems in computer science. Others strongly believe that the future lies in the exploration of the mathematical foundations of the problems at hand. The field of computer security tends to accept the latter view as the more appropriate approach, owing to its more workable validation and verification possibilities. The lack of rigorous scientific practice prevalent in biologically inspired security research does not help in presenting bio-inspired security approaches as a viable way of dealing with complex security problems. This chapter introduces a biologically inspired algorithm, the Self-Organising Map (SOM), developed by Teuvo Kohonen in 1981. Since the algorithm's inception it has been scrutinised by the scientific community and analysed in more than 4000 research papers, many of which dealt with various computer security issues, from anomaly detection and analysis of executables to wireless network monitoring. In this chapter a review of security-related SOM research undertaken in the past is presented and analysed. The algorithm's biological analogies are detailed, and the authors' view on the future possibilities of this successful bio-inspired approach is given. The SOM algorithm's close relation to a number of vital functions of the human brain, and the emergence of multi-core computer architectures, are the two main reasons behind our assumption that the future of the SOM algorithm and its variations is promising, notably in the field of computer security.
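As a concrete illustration of the algorithm introduced above, the following is a minimal NumPy sketch of Kohonen's training loop: for each randomly drawn sample, the best-matching unit (BMU) is found, and it and its grid neighbours are pulled towards the sample under a Gaussian neighbourhood with exponentially decaying learning rate and radius. The grid size, decay schedule, and toy data are illustrative choices, not a reference implementation.

\begin{verbatim}
# Minimal NumPy sketch of Kohonen's SOM update rule (illustrative choices).
import numpy as np

def train_som(data, grid=(10, 10), n_iter=5000, lr0=0.5, sigma0=3.0, seed=0):
    rng = np.random.default_rng(seed)
    n_rows, n_cols = grid
    weights = rng.normal(size=(n_rows, n_cols, data.shape[1]))
    # Pre-compute grid coordinates for the neighbourhood function.
    yy, xx = np.mgrid[0:n_rows, 0:n_cols]
    for t in range(n_iter):
        x = data[rng.integers(len(data))]
        # BMU: the node whose weight vector is closest to the sample.
        d2 = ((weights - x) ** 2).sum(axis=2)
        bi, bj = np.unravel_index(d2.argmin(), d2.shape)
        # Exponentially decaying learning rate and neighbourhood radius.
        lr = lr0 * np.exp(-t / n_iter)
        sigma = sigma0 * np.exp(-t / n_iter)
        # Gaussian neighbourhood centred on the BMU, over grid distance.
        h = np.exp(-((yy - bi) ** 2 + (xx - bj) ** 2) / (2 * sigma ** 2))
        weights += lr * h[..., None] * (x - weights)
    return weights

# Toy usage: organise 3-D points onto a 2-D grid.
data = np.random.default_rng(1).random((1000, 3))
som_weights = train_som(data)
\end{verbatim}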
We present a new method for the mitigation of observational systematic effects in angular galaxy clustering via corrective random galaxy catalogues. Real and synthetic galaxy data, from the Kilo-Degree Survey's 4$^{\rm th}$ Data Release (KiDS-$1000$) and the Full-sky Lognormal Astro-fields Simulation Kit (FLASK) package respectively, are used to train self-organising maps (SOMs) to learn the multivariate relationships between observed galaxy number density and up to six systematic-tracer variables, including seeing, Galactic dust extinction, and Galactic stellar density. We then create `organised randoms', i.e. random galaxy catalogues with spatially variable number densities mimicking the learnt systematic density modes in the data. Using realistically biased mock data, we show that these organised randoms consistently subtract spurious density modes from the two-point angular correlation function $w(\vartheta)$, correcting biases of up to $12\sigma$ in the mean clustering amplitude to as low as $0.1\sigma$, over a high signal-to-noise angular range of 7-100 arcmin. Their performance is also validated for angular clustering cross-correlations in a bright, flux-limited subset of KiDS-$1000$, comparing against an analogous sample constructed from highly complete spectroscopic redshift data. Each organised random catalogue object is a `clone' carrying the properties of a real galaxy, and is distributed throughout the survey footprint according to the parent galaxy's position in systematics-space. Thus, sub-sample randoms are readily derived from a single master random catalogue via the same selection as applied to the real galaxies. Our method is expected to improve in performance with increased survey area, galaxy number density, and systematic contamination, making organised randoms extremely promising for current and future clustering analyses of faint samples.
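A hedged sketch of the organised-randoms idea follows, under simplifying assumptions: a SOM is trained on the systematic-tracer values at galaxy positions, and candidate random points are then accepted with probability proportional to the relative galaxy density in their SOM cell, so that the accepted randoms inherit the learnt systematic density modes. The rejection-sampling shortcut, array names, and \texttt{minisom} package are illustrative stand-ins for the authors' galaxy-cloning procedure.

\begin{verbatim}
# Illustrative 'organised randoms' sketch via rejection sampling in
# systematics space (a stand-in for the authors' cloning procedure).
import numpy as np
from minisom import MiniSom

rng = np.random.default_rng(0)
n_sys = 6                                     # seeing, extinction, stellar density, ...
gal_sys = rng.normal(size=(100_000, n_sys))   # systematics at galaxy positions
rand_sys = rng.normal(size=(200_000, n_sys))  # systematics at candidate random positions

som = MiniSom(20, 20, n_sys, sigma=2.0, learning_rate=0.5, random_seed=2)
som.train_random(gal_sys, 10_000)

def cells(data, shape=(20, 20)):
    """Flattened best-matching-unit index for each object."""
    return np.array([np.ravel_multi_index(som.winner(x), shape) for x in data])

gal_c, rand_c = cells(gal_sys), cells(rand_sys)
n_gal = np.bincount(gal_c, minlength=400).astype(float)
n_rand = np.bincount(rand_c, minlength=400).astype(float)

# Acceptance probability per cell: relative galaxy density in systematics
# space, so the accepted randoms inherit the learnt spurious density modes.
p = np.divide(n_gal, n_rand, out=np.zeros_like(n_gal), where=n_rand > 0)
p /= p.max()
keep = rng.random(len(rand_c)) < p[rand_c]
print(f"accepted {keep.mean():.1%} of candidate randoms as organised randoms")
\end{verbatim}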
We use a dense redshift survey in the foreground of the Subaru GTO 2 deg$^{2}$ weak lensing field (centered at $\alpha_{2000} = 16^{\rm h}04^{\rm m}44^{\rm s}$; $\delta_{2000} = 43^{\circ}11^{\prime}24^{\prime\prime}$) to assess the completeness, and comment on the purity, of massive halo identification in the weak lensing map. The redshift survey (published here) includes 4541 galaxies; 4405 are new redshifts measured with the Hectospec on the MMT. Among the weak lensing peaks with a signal-to-noise greater than 4.25, 2/3 correspond to individual massive systems; this result is essentially identical to the Geller et al. (2010) test of the Deep Lens Survey field F2. The Subaru map, based on images with substantially better seeing than the DLS, enables detection of less massive halos at fixed redshift, as expected. We demonstrate that the procedure adopted by Miyazaki et al. (2007) for removing some contaminated peaks from the weak lensing map improves agreement between the lensing map and the redshift survey in the identification of candidate massive systems.
A crucial step in planet-hunting surveys is to select the best candidates for follow-up observations, given limited telescope resources. This is often performed by human `eyeballing', a time-consuming and statistically awkward process. Here we present a new, fast machine learning technique to separate true planet signals from astrophysical false positives. We use Self-Organising Maps (SOMs) to study the transit shapes of \emph{Kepler} and \emph{K2} known and candidate planets. We find that SOMs are capable of distinguishing known planets from known false positives with a success rate of $87.0\%$, using the transit shape alone. Furthermore, they do not require any candidates to be dispositioned prior to use, meaning that they can be used early in a mission's lifetime. A method for classifying candidates using a SOM is developed, and applied to previously unclassified members of the \emph{Kepler} KOI list as well as candidates from the \emph{K2} mission. The method is extremely fast, taking minutes to run the entire KOI list on a typical laptop. We make \texttt{Python} code for performing classifications publicly available, using either new SOMs or those created in this work. The SOM technique represents a novel method for ranking planetary candidate lists, and can be used either alone or as part of a larger autovetting code.
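The authors' released \texttt{Python} code should be preferred in practice; the following is only an illustrative sketch of the underlying idea, with placeholder data: a SOM is trained on binned, phase-folded transit shapes, and a candidate is scored by the fraction of known planets among training objects sharing its best-matching unit. Grid size, bin count, and the \texttt{minisom} package are assumptions made here for brevity.

\begin{verbatim}
# Illustrative SOM vetting sketch with placeholder data (prefer the
# authors' released code for real use).
import numpy as np
from minisom import MiniSom

rng = np.random.default_rng(3)
n_bins = 50                                 # points per phase-folded transit shape
shapes = rng.normal(size=(2_000, n_bins))   # placeholder training shapes
labels = rng.integers(0, 2, size=2_000)     # 1 = known planet, 0 = false positive

som = MiniSom(8, 8, n_bins, sigma=1.5, learning_rate=0.5, random_seed=4)
som.train_random(shapes, 5_000)

# Per-cell planet fraction from the labelled training set.
hits = np.zeros((8, 8))
counts = np.zeros((8, 8))
for shape, lab in zip(shapes, labels):
    i, j = som.winner(shape)
    hits[i, j] += lab
    counts[i, j] += 1
planet_frac = np.divide(hits, counts, out=np.zeros_like(hits), where=counts > 0)

def rank_candidate(shape):
    """Planet-likeness score for an undispositioned candidate."""
    return planet_frac[som.winner(shape)]

print(rank_candidate(rng.normal(size=n_bins)))
\end{verbatim}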
Calibrating the photometric redshifts of $>10^9$ galaxies for upcoming weak lensing cosmology experiments is a major challenge for the astrophysics community. The path to obtaining the required spectroscopic redshifts for training and calibration is daunting, given the anticipated depths of the surveys and the difficulty in obtaining secure redshifts for some faint galaxy populations. Here we present an analysis of the problem based on the self-organizing map, a method of mapping the distribution of data in a high-dimensional space and projecting it onto a lower-dimensional representation. We apply this method to existing photometric data from the COSMOS survey selected to approximate the anticipated Euclid weak lensing sample, enabling us to robustly map the empirical distribution of galaxies in the multidimensional color space defined by the expected Euclid filters. Mapping this multicolor distribution lets us determine where in galaxy color space redshifts from current spectroscopic surveys exist, and where they are systematically missing. Crucially, the method lets us determine whether a spectroscopic training sample is representative of the full photometric space occupied by the galaxies in a survey. We explore optimal sampling techniques and estimate the additional spectroscopy needed to map out the color-redshift relation, finding that sampling the galaxy distribution in color space in a systematic way can efficiently meet the calibration requirements. While the analysis presented here focuses on the Euclid survey, similar analysis can be applied to other surveys facing the same calibration challenge, such as DES, LSST, and WFIRST.
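A minimal sketch of the coverage test described above, with placeholder inputs: a SOM is trained on survey colours, and cells that contain galaxies but no spectroscopic redshifts flag the regions of colour space where the training sample is systematically missing. The grid size, colour dimensionality, and \texttt{minisom} package are illustrative assumptions.

\begin{verbatim}
# Minimal sketch of the colour-space coverage test (placeholder inputs).
import numpy as np
from minisom import MiniSom

rng = np.random.default_rng(5)
n_col = 7                                  # e.g. adjacent-band colours
survey = rng.normal(size=(50_000, n_col))  # placeholder photometric colours
has_specz = rng.random(50_000) < 0.03      # placeholder spec-z availability

som = MiniSom(30, 30, n_col, sigma=3.0, learning_rate=0.5, random_seed=6)
som.train_random(survey, 20_000)

occupied = np.zeros((30, 30), dtype=bool)  # cells containing any galaxy
covered = np.zeros((30, 30), dtype=bool)   # cells containing a spec-z galaxy
for x, s in zip(survey, has_specz):
    i, j = som.winner(x)
    occupied[i, j] = True
    covered[i, j] |= bool(s)

# Cells with galaxies but no spectroscopy: colours to target with new spectra.
missing = occupied & ~covered
print(f"{missing.sum()} of {occupied.sum()} occupied cells lack spec-z coverage")
\end{verbatim}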