Beyond the Hubble Sequence -- Exploring Galaxy Morphology with Unsupervised Machine Learning


Added by Cheng Ting-Yun

Publication date 2020

fields Physics

Language English

Authors Ting-Yun Cheng - Marc Huertas-Company - Christopher J. Conselice

Astrophysics of Galaxies


We explore unsupervised machine learning for galaxy morphology analyses using a combination of feature extraction with a vector-quantised variational autoencoder (VQ-VAE) and hierarchical clustering (HC). We propose a new methodology that includes: (1) consideration of the clustering performance simultaneously when learning features from images; (2) allowing for various distance thresholds within the HC algorithm; (3) using the galaxy orientation to determine the number of clusters. This setup provides 27 clusters created with this unsupervised learning, which we show are well separated based on galaxy shape and structure (e.g., Sérsic index, concentration, asymmetry, Gini coefficient). These resulting clusters also correlate well with physical properties such as the colour-magnitude diagram, and span the range of scaling relations such as mass vs. size amongst the different machine-defined clusters. When we merge these multiple clusters into two large preliminary clusters to provide a binary classification, an accuracy of $\sim 87\%$ is reached using an imbalanced dataset, matching real galaxy distributions, which includes 22.7% early-type galaxies and 77.3% late-type galaxies. Comparing the given clusters with classic Hubble types (ellipticals, lenticulars, early spirals, late spirals, and irregulars), we show that there is an intrinsic vagueness in visual classification systems, in particular for galaxies with transitional features such as lenticulars and early spirals. Based on this, the main result in this work is not how well our unsupervised method matches visual classifications and physical properties, but that the method provides an independent classification that may be more physically meaningful than any visually based one.
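
As a rough illustration of item (2) above, the sketch below builds one hierarchical-clustering dendrogram and cuts it at several distance thresholds. It is not the authors' pipeline: the synthetic two-dimensional `features` array is a hypothetical stand-in for VQ-VAE features, and the Ward linkage, the threshold values and the helper name `cluster_at_thresholds` are all assumptions made for the example.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster


def cluster_at_thresholds(features, thresholds, method="ward"):
    """Build one dendrogram and cut it at each distance threshold.

    Returns a dict mapping threshold -> array of integer cluster labels.
    """
    tree = linkage(features, method=method)
    return {t: fcluster(tree, t=t, criterion="distance") for t in thresholds}


rng = np.random.default_rng(0)

# Hypothetical stand-in for learned image features: three compact,
# well-separated blobs of 50 points each (assumption, not real data).
centres = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
features = np.vstack([c + 0.3 * rng.standard_normal((50, 2)) for c in centres])

labels_by_threshold = cluster_at_thresholds(features, thresholds=[1.0, 20.0, 1000.0])

# Number of clusters obtained at each threshold.
n_clusters = {t: len(np.unique(lab)) for t, lab in labels_by_threshold.items()}
for t, n in n_clusters.items():
    print(f"threshold={t:g}: {n} clusters")
```

Lower thresholds split the sample into more, finer clusters, while higher thresholds merge them, so the choice of threshold directly controls how many morphological groups come out of the clustering.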


Exploring Coronal Heating Using Unsupervised Machine-Learning

Shabbir Bawaji, Ujjaini Alam, Surajit Mondal (2021)

The perplexing mystery of what maintains the solar coronal temperature at about a million K, while the visible disc of the Sun is only at 5800 K, has been a long-standing problem in solar physics. A recent study by Mondal (2020) has provided the first evidence for the presence of numerous ubiquitous impulsive emissions at low radio frequencies from the quiet Sun regions, which could hold the key to solving this mystery. These features occur at rates of about five hundred events per minute, and their strength is only a few percent of the background steady emission. One of the next steps for exploring the feasibility of this resolution to the coronal heating problem is to understand the morphology of these emissions. To meet this objective we have developed a technique based on an unsupervised machine learning approach for characterising the morphology of these impulsive emissions. Here we present the results of applying this technique to over 8000 images spanning 70 minutes of data, in which about 34,500 features could robustly be characterised as 2D elliptical Gaussians.
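
To make "characterised as 2D elliptical Gaussians" concrete, here is a minimal sketch of fitting a rotated elliptical Gaussian to a single synthetic image patch by least squares. It only illustrates the model shape, not the unsupervised technique of the paper; the parameterisation, the noise level, the patch size and the initial guess are assumptions chosen for the example.

```python
import numpy as np
from scipy.optimize import curve_fit


def elliptical_gaussian(coords, amp, x0, y0, sigma_x, sigma_y, theta):
    """Rotated 2D Gaussian evaluated on flattened (x, y) coordinates."""
    x, y = coords
    ct, st = np.cos(theta), np.sin(theta)
    # Rotate into the frame aligned with the ellipse axes.
    xr = (x - x0) * ct + (y - y0) * st
    yr = -(x - x0) * st + (y - y0) * ct
    return amp * np.exp(-0.5 * ((xr / sigma_x) ** 2 + (yr / sigma_y) ** 2))


# Pixel grid of a hypothetical 41 x 41 patch, flattened for curve_fit.
y, x = np.mgrid[0:41, 0:41].astype(float)
coords = np.vstack([x.ravel(), y.ravel()])

# Synthetic feature with known parameters plus weak Gaussian noise.
true_params = (1.0, 20.0, 18.0, 5.0, 2.5, 0.4)
rng = np.random.default_rng(1)
image = elliptical_gaussian(coords, *true_params)
image = image + 0.01 * rng.standard_normal(image.shape)

# Least-squares fit starting from a rough initial guess.
initial_guess = (0.8, 19.0, 19.0, 4.0, 3.0, 0.3)
fitted, covariance = curve_fit(elliptical_gaussian, coords, image, p0=initial_guess)

print("amplitude, x0, y0, sigma_x, sigma_y, theta =", np.round(fitted, 3))
```

The six fitted parameters (amplitude, centre, two widths and a position angle) are the kind of compact morphological description that can then be compared across many features.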

Solar and Stellar Astrophysics Instrumentation and Methods for Astrophysics Machine Learning

Synergies between low- and intermediate-redshift galaxy populations revealed with unsupervised machine learning

Sebastian Turner, Małgorzata Siudek, Samir Salim (2021)

The colour bimodality of galaxies provides an empirical basis for theories of galaxy evolution. However, the balance of processes that begets this bimodality has not yet been constrained. A more detailed view of the galaxy population is needed, which we achieve in this paper by using unsupervised machine learning to combine multi-dimensional data at two different epochs. We aim to understand the cosmic evolution of galaxy subpopulations by uncovering substructures within the colour bimodality. We choose a clustering algorithm that models clusters using only the most discriminative data available, and apply it to two galaxy samples: one from the second edition of the GALEX-SDSS-WISE Legacy Catalogue (GSWLC-2; $z \sim 0.06$), and the other from the VIMOS Public Extragalactic Redshift Survey (VIPERS; $z \sim 0.65$). We cluster within a nine-dimensional feature space defined purely by rest-frame ultraviolet-through-near-infrared colours. Both samples are similarly partitioned into seven clusters, breaking down into four of mostly star-forming galaxies (including the vast majority of green valley galaxies) and three of mostly passive galaxies. The separation between these two families of clusters suggests differences in the evolution of their galaxies, and that these differences are strongly expressed in their colours alone. The samples are closely related, with star-forming/green-valley clusters at both epochs forming morphological sequences, capturing the gradual internally-driven growth of galaxy bulges. At high stellar masses, this growth is linked with quenching. However, it is only in our low-redshift sample that additional, environmental processes appear to be involved in the evolution of low-mass passive galaxies.
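
The idea of clustering galaxies in a nine-dimensional colour space can be sketched with a plain Gaussian mixture model. This is a swapped-in technique, not the discriminative clustering algorithm the abstract refers to, and the two synthetic populations below (labelled `star_forming` and `passive` purely for illustration) are invented data, not GSWLC-2 or VIPERS measurements.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(42)
n_colours = 9  # dimensionality of the colour feature space

# Two hypothetical populations with different mean colours (assumption).
star_forming = rng.normal(loc=0.0, scale=0.5, size=(100, n_colours))
passive = rng.normal(loc=3.0, scale=0.5, size=(100, n_colours))
colours = np.vstack([star_forming, passive])

# Fit a two-component mixture and assign each galaxy to a component.
model = GaussianMixture(n_components=2, covariance_type="full", random_state=0)
labels = model.fit_predict(colours)

print("cluster sizes:", np.bincount(labels))
```

With real colours the populations overlap, so the number of components and the features that carry the separation have to be chosen with more care than in this toy setup.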

Astrophysics of Galaxies

Galaxy morphological classification in deep-wide surveys via unsupervised machine learning

Garreth Martin, Sugata Kaviraj, Alex Hocking (2019)

Galaxy morphology is a fundamental quantity that is essential not only for the full spectrum of galaxy-evolution studies, but also for a plethora of science in observational cosmology. While a rich literature exists on morphological-classification techniques, the unprecedented data volumes, coupled, in some cases, with the short cadences of forthcoming Big-Data surveys (e.g. from the LSST), present novel challenges for this field. Large data volumes make such datasets intractable for visual inspection (even via massively-distributed platforms like Galaxy Zoo), while short cadences make it difficult to employ techniques like supervised machine learning, since it may be impractical to repeatedly produce training sets on short timescales. Unsupervised machine learning, which does not require training sets, is ideally suited to the morphological analysis of new and forthcoming surveys. Here, we employ an algorithm that performs clustering of graph representations, in order to group image patches with similar visual properties and objects constructed from those patches, like galaxies. We implement the algorithm on the Hyper-Suprime-Cam Subaru-Strategic-Program Ultra-Deep survey, to autonomously reduce the galaxy population to a small number (160) of morphological clusters, populated by galaxies with similar morphologies, which are then benchmarked using visual inspection. The morphological classifications (which we release publicly) exhibit a high level of purity, and reproduce known trends in key galaxy properties as a function of morphological type at $z<1$ (e.g. stellar-mass functions, rest-frame colours and the position of galaxies on the star-formation main sequence). Our study demonstrates the power of unsupervised machine learning in performing accurate morphological analysis, which will become indispensable in this new era of deep-wide surveys.

Astrophysics of Galaxies Instrumentation and Methods for Astrophysics

Morphology and kinematics of orbital components in CALIFA galaxies across the Hubble sequence

Ling Zhu, Glenn van de Ven, Jairo Mendez-Abreu (2018)

Based on the stellar orbit distribution derived from orbit-superposition Schwarzschild models, we decompose each of 250 representative present-day galaxies into four orbital components: cold with strong rotation, warm with weak rotation, hot with dominant random motion and counter-rotating (CR). We rebuild the surface brightness ($\Sigma$) of each orbital component and we present in figures and tables a quantification of their morphologies using the Sérsic index $n$, concentration $C = \log(\Sigma_{0.1R_e}/\Sigma_{R_e})$ and intrinsic flattening $q_{\mathrm{Re}}$ and $q_{\mathrm{Rmax}}$, with $R_e$ the half-light radius and $R_{\mathrm{max}}$ the CALIFA data coverage. We find that: (1) kinematically hotter components are generally more concentrated and rounder than colder components, and (2) all components become more concentrated and thicker/rounder in more massive galaxies; they change from disk-like in low-mass late-type galaxies to bulge-like in high-mass early-type galaxies. Our findings suggest that Sérsic $n$ is not a good discriminator between rotating bulges and non-rotating bulges. The luminosity fraction of cold orbits $f_{\mathrm{cold}}$ is well correlated with the photometrically-decomposed disk fraction $f_{\mathrm{disk}}$ as $f_{\mathrm{cold}} = 0.14 + 0.23 f_{\mathrm{disk}}$. Similarly, the hot orbit fraction $f_{\mathrm{hot}}$ is correlated with the bulge fraction $f_{\mathrm{bulge}}$ as $f_{\mathrm{hot}} = 0.19 + 0.31 f_{\mathrm{bulge}}$. The warm orbits mainly contribute to disks in low-mass late-type galaxies, and to bulges in high-mass early-type galaxies. The cold, warm, and hot components generally follow the same morphology ($\epsilon = 1 - q_{\mathrm{Rmax}}$) versus kinematics ($\sigma_z^2/\overline{V_{\mathrm{tot}}^2}$) relation as the thin disk, thick disk/pseudo bulge, and classical bulge identified from cosmological simulations.
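
The two linear relations and the concentration definition quoted above can be evaluated directly, as in the short sketch below. The function names are hypothetical, the logarithm in $C$ is assumed to be base 10, and the concentration is computed for an ideal Sérsic profile only to show how $C$ grows with $n$; the orbital components in the paper need not follow such a profile, and the linear relations describe average trends rather than predictions for individual galaxies.

```python
import math
from scipy.special import gammaincinv


def cold_fraction_from_disk(f_disk):
    """Mean relation quoted in the abstract: f_cold = 0.14 + 0.23 f_disk."""
    return 0.14 + 0.23 * f_disk


def hot_fraction_from_bulge(f_bulge):
    """Mean relation quoted in the abstract: f_hot = 0.19 + 0.31 f_bulge."""
    return 0.19 + 0.31 * f_bulge


def sersic_concentration(n):
    """C = log10(Sigma(0.1 R_e) / Sigma(R_e)) for an ideal Sersic profile.

    Uses Sigma(R) = Sigma_e * exp(-b_n * ((R / R_e)**(1/n) - 1)), where b_n
    solves P(2n, b_n) = 1/2 for the regularised lower incomplete gamma
    function P. Base-10 logarithm is an assumption.
    """
    b_n = gammaincinv(2.0 * n, 0.5)
    return b_n * (1.0 - 0.1 ** (1.0 / n)) / math.log(10.0)


print("f_cold for a pure disk (f_disk = 1):", cold_fraction_from_disk(1.0))
print("f_hot for f_bulge = 0.5:", hot_fraction_from_bulge(0.5))
print("C for n = 1 and n = 4:", sersic_concentration(1.0), sersic_concentration(4.0))
```

For an exponential profile ($n = 1$) this gives a lower concentration than for a de Vaucouleurs-like profile ($n = 4$), in line with the statement that more bulge-like components are more concentrated.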

Astrophysics of Galaxies

The Hubble Sequence at $z\sim0$ in the IllustrisTNG simulation with deep learning

M. Huertas-Company, V. Rodriguez-Gomez, D. Nelson (2019)

We analyze the optical morphologies of galaxies in the IllustrisTNG simulation at $z\sim0$ with a Convolutional Neural Network trained on visual morphologies in the Sloan Digital Sky Survey. We generate mock SDSS images of a mass-complete sample of $\sim12,000$ galaxies in the simulation using the radiative transfer code SKIRT and include PSF and noise to match the SDSS r-band properties. The images are then processed through the exact same neural network used to estimate SDSS morphologies to classify simulated galaxies in four morphological classes (E, S0/a, Sab, Scd). The CNN model finds that $\sim95\%$ of the simulated galaxies fall in one of the four main classes with high confidence. The mass-size relations of the simulated galaxies divided by morphological type also reproduce well the slope and the normalization of observed relations, which confirms the realism of optical morphologies in the TNG suite. However, the Stellar Mass Functions (SMFs) decomposed into different morphologies still show significant discrepancies with observations at both the low- and high-mass ends. We find that the high-mass end of the SMF is dominated in TNG by massive disk galaxies, while early-type galaxies dominate in the observations according to the CNN classifications. The present work highlights the importance of detailed comparisons between observations and simulations in comparable conditions.

Astrophysics of Galaxies