
Optimizing exoplanet atmosphere retrieval using unsupervised machine-learning classification

Published by Joshua Hayes
Publication date 2019
Research field Physics
Research language English





One of the principal bottlenecks to atmosphere characterisation in the era of all-sky surveys is the availability of fast, autonomous and robust atmospheric retrieval methods. We present a new approach using unsupervised machine learning to generate informed priors for retrieval of exoplanetary atmosphere parameters from transmission spectra. We use principal component analysis (PCA) to efficiently compress the information content of a library of transmission spectra forward models generated using the PLATON package. We then apply a $k$-means clustering algorithm in PCA space to segregate the library into discrete classes. We show that our classifier is almost always able to instantaneously place a previously unseen spectrum into the correct class, for low-to-moderate spectral resolutions, $R$, in the range $R = 30$ to $300$ and noise levels up to 10 per cent of the peak-to-trough spectrum amplitude. The distribution of physical parameters for all members of the class therefore provides an informed prior for standard retrieval methods such as nested sampling. We benchmark our informed-prior approach against a standard uniform-prior nested sampler, finding that our approach is up to a factor of two faster, with negligible reduction in accuracy. We demonstrate the application of this method to existing and near-future observatories, and show that it is suitable for real-world application. Our general approach is not specific to transmission spectroscopy and should be more widely applicable to cases that involve repetitive fitting of trusted high-dimensional models to large data catalogues, including beyond exoplanetary science.
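
The pipeline described in this abstract (compress a model library with PCA, cluster it with $k$-means, classify a new spectrum, and read an informed prior off its class) can be sketched as follows. This is a minimal illustration and not the authors' code: the one-parameter forward model `toy_spectrum`, the grid of "temperatures", the choice of two PCA components and four clusters, and the 5 per cent noise level are all assumptions made for the example, and PLATON is not used.

```python
import numpy as np

def fit_pca(library, n_components):
    """PCA via SVD of the mean-centred library; returns (mean, components)."""
    mean = library.mean(axis=0)
    _, _, vt = np.linalg.svd(library - mean, full_matrices=False)
    return mean, vt[:n_components]

def project(spectra, mean, components):
    """Coordinates of one or more spectra in PCA space."""
    return (np.atleast_2d(spectra) - mean) @ components.T

def nearest_centroid(points, centroids):
    """Index of the closest centroid for each point."""
    dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)

def kmeans(points, k, n_iter=50, seed=0):
    """Plain Lloyd's algorithm; returns (centroids, labels)."""
    rng = np.random.default_rng(seed)
    centroids = points[rng.choice(len(points), size=k, replace=False)].copy()
    for _ in range(n_iter):
        labels = nearest_centroid(points, centroids)
        for j in range(k):
            if np.any(labels == j):  # an empty cluster keeps its old centroid
                centroids[j] = points[labels == j].mean(axis=0)
    return centroids, nearest_centroid(points, centroids)

# Hypothetical toy forward model (NOT PLATON): a single absorption feature
# whose strength scales linearly with one parameter, called "temperature".
wavelength = np.linspace(1.0, 2.0, 60)
feature = np.exp(-0.5 * ((wavelength - 1.4) / 0.1) ** 2)

def toy_spectrum(temperature):
    return (temperature / 1000.0) * feature

# Library of forward models on a grid of the toy parameter.
temperatures = np.linspace(500.0, 2500.0, 400)
library = np.array([toy_spectrum(t) for t in temperatures])

# Compress with PCA, then cluster the library in PCA space.
mean, components = fit_pca(library, n_components=2)
centroids, labels = kmeans(project(library, mean, components), k=4)

# A previously unseen spectrum, with noise at 5% of its peak-to-trough amplitude.
rng = np.random.default_rng(1)
true_temperature = 1300.0
clean = toy_spectrum(true_temperature)
amplitude = clean.max() - clean.min()
noisy = clean + rng.normal(0.0, 0.05 * amplitude, size=clean.shape)

# Classify it, then use the parameter range of its class as the informed prior.
assigned = nearest_centroid(project(noisy, mean, components), centroids)[0]
members = temperatures[labels == assigned]
prior_lo, prior_hi = members.min(), members.max()
print(f"class {assigned}: informed prior on temperature = [{prior_lo:.0f}, {prior_hi:.0f}]")
```

In a real retrieval the interval `[prior_lo, prior_hi]` (or the full distribution of class members) would replace a uniform prior over the whole parameter range in the nested sampler; that step is omitted here.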




Read also

We introduce a new machine-learning-based technique to detect exoplanets using the transit method. Machine learning and deep learning techniques have proven to be broadly applicable in various scientific research areas. We aim to exploit some of these methods to improve the conventional algorithm-based approaches presently used in astrophysics to detect exoplanets. Using the time-series analysis library TSFresh to analyse light curves, we extracted 789 features from each curve, which capture the information about the characteristics of a light curve. We then used these features to train a gradient boosting classifier using the machine learning tool lightgbm. This approach was tested on simulated data, which showed that it is more effective than the conventional box least squares fitting (BLS) method. We further found that our method produced comparable results to existing state-of-the-art deep learning models, while being much more computationally efficient and without needing folded and secondary views of the light curves. For Kepler data, the method is able to predict a planet with an AUC of 0.948, so that 94.8 per cent of the true planet signals are ranked higher than non-planet signals. The resulting recall is 0.96, so that 96 per cent of real planets are classified as planets. For the Transiting Exoplanet Survey Satellite (TESS) data, we found our method can classify light curves with an accuracy of 0.98, and is able to identify planets with a recall of 0.82 at a precision of 0.63.
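
The metrics quoted in this abstract can be made concrete with a small sketch. It assumes the usual pairwise reading of AUC, namely the fraction of (planet, non-planet) pairs in which the planet receives the higher classifier score, and the standard definitions of precision and recall at a score threshold. The scores, labels, and threshold below are invented for illustration and are not taken from the Kepler or TESS results.

```python
def auc_pairwise(scores, labels):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def precision_recall(scores, labels, threshold):
    """Precision and recall when scores >= threshold are called planets."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    return tp / (tp + fp), tp / (tp + fn)

# Hypothetical classifier scores; label 1 = planet, 0 = non-planet.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0, 0]

auc = auc_pairwise(scores, labels)
precision, recall = precision_recall(scores, labels, threshold=0.5)
print(f"AUC = {auc:.3f}, precision = {precision:.2f}, recall = {recall:.2f}")
```

The quadratic pair loop is chosen for clarity; for large catalogues a rank-based computation gives the same number more cheaply.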
The perplexing mystery of what maintains the solar coronal temperature at about a million K, while the visible disc of the Sun is only at 5800 K, has been a long-standing problem in solar physics. A recent study by Mondal (2020) has provided the first evidence for the presence of numerous ubiquitous impulsive emissions at low radio frequencies from the quiet Sun regions, which could hold the key to solving this mystery. These features occur at rates of about five hundred events per minute, and their strength is only a few percent of the background steady emission. One of the next steps for exploring the feasibility of this resolution to the coronal heating problem is to understand the morphology of these emissions. To meet this objective we have developed a technique based on an unsupervised machine learning approach for characterising the morphology of these impulsive emissions. Here we present the results of applying this technique to over 8000 images spanning 70 minutes of data in which about 34,500 features could robustly be characterised as 2D elliptical Gaussians.
Galaxy morphology is a fundamental quantity that is essential not only for the full spectrum of galaxy-evolution studies, but also for a plethora of science in observational cosmology. While a rich literature exists on morphological-classification techniques, the unprecedented data volumes, coupled, in some cases, with the short cadences of forthcoming Big-Data surveys (e.g. from the LSST), present novel challenges for this field. Large data volumes make such datasets intractable for visual inspection (even via massively-distributed platforms like Galaxy Zoo), while short cadences make it difficult to employ techniques like supervised machine-learning, since it may be impractical to repeatedly produce training sets on short timescales. Unsupervised machine learning, which does not require training sets, is ideally suited to the morphological analysis of new and forthcoming surveys. Here, we employ an algorithm that performs clustering of graph representations, in order to group image patches with similar visual properties and objects constructed from those patches, like galaxies. We implement the algorithm on the Hyper-Suprime-Cam Subaru-Strategic-Program Ultra-Deep survey, to autonomously reduce the galaxy population to a small number (160) of morphological clusters, populated by galaxies with similar morphologies, which are then benchmarked using visual inspection. The morphological classifications (which we release publicly) exhibit a high level of purity, and reproduce known trends in key galaxy properties as a function of morphological type at z<1 (e.g. stellar-mass functions, rest-frame colours and the position of galaxies on the star-formation main sequence). Our study demonstrates the power of unsupervised machine learning in performing accurate morphological analysis, which will become indispensable in this new era of deep-wide surveys.
A machine learning technique using a two-dimensional convolutional neural network is proposed for detecting exoplanet transits. To test this new method, five different types of deep learning models, with or without folding, are constructed and studied. The light curves of the Kepler Data Release 25 are employed as the input of these models. The accuracy, reliability, and completeness are determined and their performances are compared. These results indicate that a combination of a two-dimensional convolutional neural network with folding would be an excellent choice for future transit analysis.
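
The "folding" mentioned in this abstract is taken here to mean phase folding, which stacks repeated transits on top of one another before the light curve is passed to a model. The sketch below shows only that preprocessing step on a synthetic light curve; the period, epoch, transit duration, depth, and noise level are invented for illustration, and the neural network itself is not reproduced.

```python
import numpy as np

def fold(time, period, epoch):
    """Orbital phase in [-0.5, 0.5), with the transit centred on phase 0."""
    return ((time - epoch) / period + 0.5) % 1.0 - 0.5

# Synthetic light curve with box-shaped transits (all values hypothetical).
rng = np.random.default_rng(0)
period, epoch, duration, depth = 3.0, 1.5, 0.2, 0.01
time = np.arange(0.0, 30.0, 0.01)
flux = 1.0 + rng.normal(0.0, 0.002, size=time.size)
for centre in np.arange(epoch, time[-1], period):
    flux[np.abs(time - centre) < duration / 2] -= depth

# After folding, every transit falls in the same narrow phase window.
phase = fold(time, period, epoch)
in_transit = np.abs(phase) < duration / (2 * period)
print(f"in-transit mean flux:     {flux[in_transit].mean():.4f}")
print(f"out-of-transit mean flux: {flux[~in_transit].mean():.4f}")
```

Folding requires a trial period and epoch, which is why models that work on unfolded light curves are also of interest.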
Crater ellipticity determination is a complex and time-consuming task that so far has evaded successful automation. We train a state-of-the-art computer vision algorithm to identify craters in Lunar digital elevation maps and retrieve their sizes and 2D shapes. The computational backbone of the model is MaskRCNN, a general instance segmentation framework that detects craters in an image while simultaneously producing a mask for each crater that traces its outer rim. Our post-processing pipeline then finds the closest fitting ellipse to these masks, allowing us to retrieve the crater ellipticities. Our model is able to correctly identify 87% of known craters in the longitude range we hid from the network during training and validation (test set), while predicting thousands of additional craters not present in our training data. Manual validation of a subset of these new craters indicates that a majority of them are real, which we take as an indicator of the strength of our model in learning to identify craters, despite incomplete training data. The crater size, ellipticity, and depth distributions predicted by our model are consistent with human-generated results. The model allows us to perform a large-scale search for differences in crater diameter and shape distributions between the lunar highlands and maria, and we exclude any such differences with a high statistical significance. The predicted test set catalogue and trained model are available here: https://github.com/malidib/Craters_MaskRCNN/.
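
One simple way to recover an ellipse from a binary mask is through its second moments. The sketch below uses that approach as a stand-in; the abstract does not say how the authors' post-processing pipeline fits its ellipses, so this should not be read as their method. The synthetic mask, the helper name `ellipse_from_mask`, and the use of the major-to-minor axis ratio as the ellipticity are assumptions made for the example.

```python
import numpy as np

def ellipse_from_mask(mask):
    """Estimate (semi_major, semi_minor, angle_rad) from a filled binary mask.

    For a uniformly filled ellipse, the variance of the pixel coordinates
    along a principal axis equals (semi-axis)^2 / 4.
    """
    rows, cols = np.nonzero(mask)
    coords = np.column_stack([cols, rows]).astype(float)  # (x, y) per pixel
    cov = np.cov(coords, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    semi_minor, semi_major = 2.0 * np.sqrt(eigvals)
    major_dir = eigvecs[:, 1]  # direction of the major axis, sign is arbitrary
    angle = np.arctan2(major_dir[1], major_dir[0])
    return semi_major, semi_minor, angle

# Synthetic stand-in for a predicted crater mask: a filled ellipse with
# semi-axes 40 and 20 pixels, rotated by 30 degrees.
yy, xx = np.mgrid[0:121, 0:121]
theta = np.deg2rad(30.0)
x_rot = (xx - 60) * np.cos(theta) + (yy - 60) * np.sin(theta)
y_rot = -(xx - 60) * np.sin(theta) + (yy - 60) * np.cos(theta)
mask = (x_rot / 40.0) ** 2 + (y_rot / 20.0) ** 2 <= 1.0

semi_major, semi_minor, angle = ellipse_from_mask(mask)
ellipticity = semi_major / semi_minor  # assumed definition: axis ratio a/b
print(f"a = {semi_major:.1f} px, b = {semi_minor:.1f} px, a/b = {ellipticity:.2f}")
```

The moment estimate assumes a filled, roughly uniform mask; a mask that traces only the rim, or one with holes, would need a different fit such as least squares on the boundary points.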