In this work we apply and expand on a recently introduced outlier detection algorithm that is based on an unsupervised random forest. We use the algorithm to calculate a similarity measure for stellar spectra from the Apache Point Observatory Galactic Evolution Experiment (APOGEE). We show that the similarity measure traces non-trivial physical properties and contains information about complex structures in the data. We use it for visualization and clustering of the dataset, and discuss its ability to find groups of highly similar objects, including spectroscopic twins. Searching the dataset with the similarity matrix allows us to find objects that are impossible to find using their best-fitting model parameters. These include extreme objects for which the models fail, and rare objects that are outside the scope of the model. We use the similarity measure to detect outliers in the dataset, and find a number of previously unknown Be-type stars, spectroscopic binaries, carbon-rich stars, young stars, and a few objects that we cannot interpret. Our work further demonstrates the potential for scientific discovery when combining machine learning methods with modern survey data.
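As a rough illustration of how a forest can yield a similarity measure, the sketch below assumes a common construction: two objects are similar in proportion to the fraction of trees in which they land in the same terminal leaf, and an object's outlier score is its mean distance (one minus similarity) to all other objects. The abstract does not spell out these details, so both definitions are assumptions, and the leaf indices are made-up values standing in for what a trained forest would return (for example from scikit-learn's `RandomForestClassifier.apply`).

```python
from itertools import combinations


def leaf_similarity(leaves):
    """Pairwise similarity from leaf indices.

    leaves[i][t] is the index of the leaf that object i reaches in tree t.
    Similarity is the fraction of trees in which two objects share a leaf.
    """
    n = len(leaves)
    n_trees = len(leaves[0])
    sim = [[1.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        shared = sum(1 for a, b in zip(leaves[i], leaves[j]) if a == b)
        sim[i][j] = sim[j][i] = shared / n_trees
    return sim


def outlier_scores(sim):
    """Mean distance (1 - similarity) from each object to all other objects."""
    n = len(sim)
    return [
        sum(1.0 - sim[i][j] for j in range(n) if j != i) / (n - 1)
        for i in range(n)
    ]


# Hypothetical leaf indices for 4 objects in a 4-tree forest.
leaves = [
    [0, 2, 1, 3],
    [0, 2, 1, 3],  # identical path to object 0: a "twin"
    [0, 2, 4, 3],
    [5, 6, 7, 3],  # shares a leaf with the others in one tree only
]

sim = leaf_similarity(leaves)
scores = outlier_scores(sim)
most_unusual = scores.index(max(scores))
```

In this toy case objects 0 and 1 have similarity 1, and object 3 receives the highest outlier score because it rarely shares a leaf with anything else.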
The vast volume of data generated by modern astronomical surveys offers test beds for the application of machine learning. It is important to evaluate existing tools and determine those that are optimal for extracting scientific knowledge from the available observations. We explore the possibility of using clustering algorithms to separate stellar populations with distinct chemical patterns. Star clusters are likely the most chemically homogeneous populations in the Galaxy, and therefore any practical approach to identifying distinct stellar populations should at least be able to separate clusters from each other. We applied eight clustering algorithms combined with four dimensionality reduction strategies to automatically distinguish stellar clusters using chemical abundances of 13 elements. Our sample includes 18 stellar clusters with a total of 453 stars. Statistical tests show that some pairs of clusters are indistinguishable from each other when chemical abundances from the Apache Point Observatory Galactic Evolution Experiment (APOGEE) are used. However, for most clusters we are able to automatically assign membership with metric scores similar to those of previous works. The confusion level of the automatically selected clusters is consistent with statistical tests that demonstrate the impossibility of perfectly distinguishing all the clusters from each other. These statistical tests and confusion levels establish a limit for the prospect of blindly identifying stars born in the same cluster based solely on chemical abundances. We find that some of the algorithms we explored are capable of blindly identifying stellar populations with similar ages and chemical distributions in the APOGEE data. Because some stellar clusters are chemically indistinguishable, our study supports extending the notion of weak chemical tagging to families of clusters instead of individual clusters.
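To make the idea of blind chemical clustering concrete, here is a minimal sketch that standardizes abundance columns and groups stars with a plain k-means loop. The abstract does not name the eight algorithms, so k-means is only a stand-in. The two-element abundance values are invented for illustration (the study itself uses 13 elements), and the deterministic farthest-point initialisation is a simplification chosen to keep the example reproducible.

```python
import math


def standardize(rows):
    """Scale every abundance column to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [
        math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
        for c, m in zip(cols, means)
    ]
    return [[(x - m) / s for x, m, s in zip(r, means, stds)] for r in rows]


def kmeans(points, k, n_iter=50):
    """Plain k-means with deterministic farthest-point initialisation."""
    centers = [points[0]]
    while len(centers) < k:
        # Next seed: the point farthest from its nearest existing center.
        centers.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        )
    labels = []
    for _ in range(n_iter):
        labels = [
            min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points
        ]
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append([sum(x) / len(members) for x in zip(*members)])
            else:
                new_centers.append(centers[c])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return labels


# Hypothetical ([Fe/H], [Mg/Fe]) values for stars from two clusters.
abundances = [
    [-0.50, 0.30], [-0.52, 0.28], [-0.48, 0.31],  # invented cluster A
    [0.10, 0.02], [0.12, 0.00], [0.08, 0.03],     # invented cluster B
]

labels = kmeans(standardize(abundances), k=2)
```

With well-separated toy clusters the first three stars share one label and the last three share the other. Real clusters with overlapping abundance distributions would be confused, which is the limit the abstract describes.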
In preparation for future, large-scale, multi-object, high-resolution spectroscopic surveys of the Galaxy, we present a series of tests of the precision in radial velocity and chemical abundances that any such project can achieve at a 4m class telescope. We briefly discuss a number of science cases that aim at studying the chemo-dynamical history of the major Galactic components (bulge, thin and thick disks, and halo) - either as a follow-up to the Gaia mission or on their own merits. Based on a large grid of synthetic spectra that cover the full range in stellar parameters of typical survey targets, we devise an optimal wavelength range and argue for a moderately high-resolution spectrograph. As a result, the kinematic precision is not limited by any of these factors, but will practically only suffer from systematic effects, easily reaching uncertainties <1 km/s. Under realistic survey conditions (namely, considering stars brighter than r=16 mag with reasonable exposure times) we prefer an ideal resolving power of R~20000 on average, for an overall wavelength range (with a common two-arm spectrograph design) of [395;456.5] nm and [587;673] nm. We show for the first time on a general basis that it is possible to measure chemical abundance ratios to better than 0.1 dex for many species (Fe, Mg, Si, Ca, Ti, Na, Al, V, Cr, Mn, Co, Ni, Y, Ba, Nd, Eu) and to an accuracy of about 0.2 dex for other species such as Zr, La, and Sr. While our feasibility study was explicitly carried out for the 4MOST facility, the results can be readily applied to and used for any other conceptual design study for high-resolution spectrographs.
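For orientation, the quoted resolving power can be translated into a velocity and wavelength resolution element using the standard definition R = λ/Δλ. The sketch below is only this arithmetic, not the feasibility analysis described in the abstract; the function names are illustrative.

```python
C_KM_S = 299_792.458  # speed of light in km/s


def velocity_resolution(resolving_power):
    """Velocity width of one resolution element, c / R, in km/s."""
    return C_KM_S / resolving_power


def wavelength_resolution(wavelength_nm, resolving_power):
    """Wavelength width of one resolution element, lambda / R, in nm."""
    return wavelength_nm / resolving_power


dv = velocity_resolution(20_000)                # about 15 km/s
dl_blue = wavelength_resolution(395.0, 20_000)  # blue edge of the first arm
```

At R~20000 one resolution element spans roughly 15 km/s, so a radial velocity uncertainty below 1 km/s means locating line centres to a small fraction of a resolution element.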
Gas-phase methanol was recently detected in a protoplanetary disk for the first time with ALMA. The peak abundance and distribution of methanol observed in TW Hya differed from that predicted by chemical models. Here, the chemistry of methanol gas and ice is calculated using a physical model tailored for TW Hya with the aim to contrast the results with the recent detection in this source. New pathways for the formation of larger complex molecules (e.g., ethylene glycol) are included in an updated chemical model, as well as the fragmentation of methanol ice upon photodesorption. It is found that including fragmentation upon photodesorption improves the agreement between the peak abundance reached in the chemical models with that observed in TW Hya ($\sim 10^{-11}$ with respect to \ce{H2}); however, the model predicts that the peak in emission resides a factor of $2-3$ farther out in the disk than the ALMA images. Reasons for the persistent differences in the gas-phase methanol distribution between models and the observations of TW Hya are discussed. These include the location of the ice reservoir which may coincide with the compact mm-dust disk ($\lesssim 60$~au) and sources of gas-phase methanol which have not yet been considered in models. The possibility of detecting larger molecules with ALMA is also explored. Calculations of the rotational spectra of complex molecules other than methanol using a parametric model constrained by the TW Hya observations suggest that the detection of individual emission lines of complex molecules with ALMA remains challenging. However, the signal-to-noise ratio can be enhanced via stacking of multiple transitions which have similar upper energy levels.
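The gain from stacking can be sketched with an inverse-variance weighted mean. This assumes independent Gaussian noise in each transition and, in the example, equal line strengths, neither of which is stated in the abstract; the numerical values are arbitrary.

```python
import math


def stacked_snr(signals, sigmas):
    """Signal-to-noise ratio of an inverse-variance weighted stack.

    signals: measured line strengths (same units for every transition)
    sigmas:  1-sigma noise of each measurement, assumed independent
    """
    weights = [1.0 / s ** 2 for s in sigmas]
    mean = sum(w * x for w, x in zip(weights, signals)) / sum(weights)
    sigma_mean = 1.0 / math.sqrt(sum(weights))
    return mean / sigma_mean


# Hypothetical case: each line alone has a signal-to-noise ratio of 1.
single = stacked_snr([1.0], [1.0])
stacked = stacked_snr([1.0] * 9, [1.0] * 9)
```

Under these assumptions, stacking N equally strong lines improves the signal-to-noise ratio by a factor of √N, so nine lines that are individually at the noise level reach a ratio of about 3.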
Over the past 18 months we have revisited the science requirements for a multi-object spectrograph (MOS) for the European Extremely Large Telescope (E-ELT). These efforts span the full range of E-ELT science and include input from a broad cross-section of astronomers across the ESO partner countries. In this contribution we summarise the key cases relating to studies of high-redshift galaxies, galaxy evolution, and stellar populations, with a more expansive presentation of a new case relating to detection of exoplanets in stellar clusters. A general requirement is the need for two observational modes to best exploit the large (>40 sq. arcmin) patrol field of the E-ELT. The first mode (high multiplex) requires integrated-light (or coarsely resolved) optical/near-IR spectroscopy of >100 objects simultaneously. The second (high definition), enabled by wide-field adaptive optics, requires spatially-resolved, near-IR spectroscopy of >10 objects/sub-fields. Within the context of the conceptual study for an ELT-MOS called MOSAIC, we summarise the top-level requirements from each case and introduce the next steps in the design process.
We present a pipeline based on a random forest classifier for the identification of high column-density clouds of neutral hydrogen (i.e. the Lyman limit systems, LLSs) in absorption within large spectroscopic surveys of z>3 quasars. We test the performance of this method on mock quasar spectra that reproduce the expected data quality of the Dark Energy Spectroscopic Instrument (DESI) and the WHT Enhanced Area Velocity Explorer (WEAVE) surveys, finding >90% completeness and purity for N(HI)> 10^17.2 cm^-2 LLSs against quasars of g < 23 mag at z~3.5-3.7. After training and applying our method on 10,000 quasar spectra at z~3.5-4.0 from the Sloan Digital Sky Survey (Data Release 16), we identify ~6600 LLSs with N(HI)>10^17.5 cm^-2 between z~3.1-4.0 with a completeness and purity of >90% for the classification of LLSs. Using this sample, we measure a number of LLSs per unit redshift of 2.32 +/- 0.08 at z=[3.3,3.6]. We also present results on the performance of random forest for the measurement of the LLS redshifts and HI column densities, and for the identification of broad absorption line quasars.
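The quantities quoted above follow standard definitions, sketched here with invented counts. Completeness is the fraction of true systems that are recovered, purity is the fraction of identified systems that are real, and the number per unit redshift is the count divided by the total redshift path searched. The uncertainty below is Poisson only, which is an assumption; the abstract does not state how its error bar was derived, and none of the numbers here reproduce its results.

```python
import math


def completeness(true_pos, false_neg):
    """Fraction of real systems that the classifier recovers."""
    return true_pos / (true_pos + false_neg)


def purity(true_pos, false_pos):
    """Fraction of classifier detections that are real systems."""
    return true_pos / (true_pos + false_pos)


def incidence(n_systems, redshift_path):
    """Number of systems per unit redshift and its Poisson uncertainty."""
    rate = n_systems / redshift_path
    error = math.sqrt(n_systems) / redshift_path
    return rate, error


# Hypothetical mock-test counts, for illustration only.
comp = completeness(true_pos=920, false_neg=80)
pur = purity(true_pos=920, false_pos=60)

# Hypothetical: 400 systems found along a summed redshift path of 200.
rate, rate_err = incidence(400, 200.0)
```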