Ensemble learning and iterative training (ELIT) machine learning: applications towards uncertainty quantification and automated experiment in atom-resolved microscopy

Published by Maxim Ziatdinov
Publication date: 2021
Language: English





Deep learning has emerged as a technique of choice for rapid feature extraction across imaging disciplines, allowing rapid conversion of data streams to spatial or spatiotemporal arrays of features of interest. However, applications of deep learning in experimental domains are often limited by out-of-distribution drift between experiments, where a network trained for one set of imaging conditions becomes sub-optimal for different ones. This limitation is particularly stringent in automated experiment settings, where retraining or transfer learning becomes impractical due to the need for human intervention and the associated latencies. Here we explore the reproducibility of deep learning for feature extraction in atom-resolved electron microscopy and introduce workflows based on ensemble learning and iterative training to greatly improve feature detection. This approach both incorporates uncertainty quantification into the deep learning analysis and enables rapid automated experimental workflows in which retraining the network to compensate for out-of-distribution drift caused by subtle changes in imaging conditions is replaced by human-operator or programmatic selection of networks from the ensemble. This methodology can be further applied to machine learning workflows in other imaging areas, including optical and chemical imaging.
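
The abstract does not spell out how the ensemble yields an uncertainty estimate. One common reading, offered here only as a minimal sketch and not as the authors' implementation, is to run every ensemble member on the same image and take the per-pixel mean as the prediction and the per-pixel standard deviation as the uncertainty. The names `make_toy_model` and `ensemble_predict` are hypothetical, and the threshold "models" are toy stand-ins for trained networks.

```python
from statistics import mean, pstdev
from typing import Callable, List, Sequence, Tuple

Image = List[List[float]]
Model = Callable[[Image], Image]  # maps an image to a per-pixel probability map


def make_toy_model(threshold: float) -> Model:
    """Hypothetical stand-in for a trained network: a hard intensity threshold."""
    def predict(image: Image) -> Image:
        return [[1.0 if px >= threshold else 0.0 for px in row] for row in image]
    return predict


def ensemble_predict(models: Sequence[Model], image: Image) -> Tuple[Image, Image]:
    """Return the per-pixel mean and standard deviation across ensemble members."""
    outputs = [model(image) for model in models]
    rows, cols = len(image), len(image[0])
    mean_map = [[mean(o[r][c] for o in outputs) for c in range(cols)]
                for r in range(rows)]
    # Population standard deviation: the spread of the ensemble's votes per pixel.
    std_map = [[pstdev([o[r][c] for o in outputs]) for c in range(cols)]
               for r in range(rows)]
    return mean_map, std_map


# Toy 2x2 "image" and three ensemble members that differ only in threshold.
image = [[0.1, 0.5], [0.9, 0.3]]
models = [make_toy_model(t) for t in (0.2, 0.4, 0.6)]
mean_map, std_map = ensemble_predict(models, image)
```

Pixels where all members agree get zero standard deviation, while pixels near the decision boundary get a nonzero one. In a real workflow the disagreement map could flag regions for closer inspection.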

Read also

Dimitri Bourilkov, 2019
The many ways in which machine and deep learning are transforming the analysis and simulation of data in particle physics are reviewed. The main methods based on boosted decision trees and various types of neural networks are introduced, and cutting-edge applications in the experimental and theoretical/phenomenological domains are highlighted. After describing the challenges in the application of these novel analysis techniques, the review concludes by discussing the interactions between physics and machine learning as a two-way street enriching both disciplines and helping to meet the present and future challenges of data-intensive science at the energy and intensity frontiers.
Rapidly applying the effects of detector response to physics objects (e.g. electrons, muons, showers of particles) is essential in high energy physics. Currently available tools for the transformation from truth-level physics objects to reconstructed detector-level physics objects involve manually defining resolution functions. These resolution functions are typically derived in bins of variables that are correlated with the resolution (e.g. pseudorapidity and transverse momentum). This process is time consuming, requires manual updates when detector conditions change, and can miss important correlations. Machine learning offers a way to automate the process of building these truth-to-reconstructed object transformations and can capture complex correlation for any given set of input variables. Such machine learning algorithms, with sufficient optimization, could have a wide range of applications: improving phenomenological studies by using a better detector representation, allowing for more efficient production of Geant4 simulation by only simulating events within an interesting part of phase space, and studies on future experimental sensitivity to new physics.
The appeal of serverless (FaaS) has triggered a growing interest in how to use it in data-intensive applications such as ETL, query processing, or machine learning (ML). Several systems exist for training large-scale ML models on top of serverless infrastructures (e.g., AWS Lambda) but with inconclusive results in terms of their performance and relative advantage over serverful infrastructures (IaaS). In this paper we present a systematic, comparative study of distributed ML training over FaaS and IaaS. We present a design space covering design choices such as optimization algorithms and synchronization protocols, and implement a platform, LambdaML, that enables a fair comparison between FaaS and IaaS. We present experimental results using LambdaML, and further develop an analytic model to capture cost/performance tradeoffs that must be considered when opting for a serverless infrastructure. Our results indicate that ML training pays off in serverless only for models with efficient (i.e., reduced) communication and that quickly converge. In general, FaaS can be much faster but it is never significantly cheaper than IaaS.
Data-driven prediction and physics-agnostic machine-learning methods have attracted increased interest in recent years achieving forecast horizons going well beyond those to be expected for chaotic dynamical systems. In a separate strand of research data-assimilation has been successfully used to optimally combine forecast models and their inherent uncertainty with incoming noisy observations. The key idea in our work here is to achieve increased forecast capabilities by judiciously combining machine-learning algorithms and data assimilation. We combine the physics-agnostic data-driven approach of random feature maps as a forecast model within an ensemble Kalman filter data assimilation procedure. The machine-learning model is learned sequentially by incorporating incoming noisy observations. We show that the obtained forecast model has remarkably good forecast skill while being computationally cheap once trained. Going beyond the task of forecasting, we show that our method can be used to generate reliable ensembles for probabilistic forecasting as well as to learn effective model closure in multi-scale systems.
A number of scientific competitions have been organised in the last few years with the objective of discovering innovative techniques to perform typical High Energy Physics tasks, like event reconstruction, classification and new physics discovery. Four of these competitions are summarised in this chapter, from which guidelines on organising such events are derived. In addition, a choice of competition platforms and available datasets are described.
