
Active Importance Sampling for Variational Objectives Dominated by Rare Events: Consequences for Optimization and Generalization

Published by Grant Rotskoff
Publication date: 2020
Research field: Physics
Language: English





Deep neural networks, when optimized with sufficient data, provide accurate representations of high-dimensional functions; in contrast, function approximation techniques that have predominated in scientific computing do not scale well with dimensionality. As a result, many high-dimensional sampling and approximation problems once thought intractable are being revisited through the lens of machine learning. While the promise of unparalleled accuracy may suggest a renaissance for applications that require parameterizing representations of complex systems, in many applications gathering sufficient data to develop such a representation remains a significant challenge. Here we introduce an approach that combines rare-event sampling techniques with neural network optimization to optimize objective functions that are dominated by rare events. We show that importance sampling reduces the asymptotic variance of the solution to a learning problem, suggesting benefits for generalization. We study our algorithm in the context of learning dynamical transition pathways between two states of a system, a problem with applications in statistical physics and implications in machine learning theory. Our numerical experiments demonstrate that we can successfully learn even with the compounding difficulties of high dimension and rare data.
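To make the variance-reduction claim concrete, here is a minimal sketch of textbook importance sampling for a quantity dominated by a rare event. It is not the algorithm of the paper: the target (the tail probability P(X > a) of a standard normal), the shifted Gaussian proposal, and the function names `naive_estimate`, `importance_estimate`, and `exact_tail` are all assumptions chosen for illustration.

```python
import math
import numpy as np


def exact_tail(threshold):
    """Exact P(X > threshold) for X ~ N(0, 1), used only as a reference."""
    return 0.5 * math.erfc(threshold / math.sqrt(2.0))


def naive_estimate(threshold, n, rng):
    """Plain Monte Carlo: the fraction of N(0, 1) samples above the threshold."""
    x = rng.standard_normal(n)
    return float(np.mean(x > threshold))


def importance_estimate(threshold, n, rng):
    """Importance sampling with an assumed proposal N(threshold, 1).

    Samples are drawn where the rare event is typical, then reweighted by
    the likelihood ratio p(y) / q(y) = exp(-threshold * y + threshold**2 / 2).
    """
    y = rng.standard_normal(n) + threshold
    log_w = -threshold * y + 0.5 * threshold**2
    return float(np.mean((y > threshold) * np.exp(log_w)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a, n = 4.0, 100_000
    print("exact     :", exact_tail(a))
    print("naive     :", naive_estimate(a, n, rng))
    print("importance:", importance_estimate(a, n, rng))
```

With the same sample budget, the naive estimator sees only a handful of samples in the tail, whereas the reweighted estimator draws about half of its samples there, which is why its relative error is much smaller in this toy setting.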




Read also

Active matter represents a broad class of systems that evolve far from equilibrium due to the local injection of energy. Like their passive analogues, transformations between distinct metastable states in active matter proceed through rare fluctuations; however, their detailed-balance-violating dynamics render these events difficult to study. Here, we present a simulation method for evaluating the rate and mechanism of rare events in generic nonequilibrium systems and apply it to study the conformational changes of a passive solute in an active fluid. The method employs a variational optimization of a control force that renders the rare event a typical one, supplying an exact estimate of its rate as a ratio of path partition functions. Using this method we find that increasing activity in the active bath can enhance the rate of conformational switching of the passive solute in a manner consistent with recent bounds from stochastic thermodynamics.
We present a new method for sampling rare and large fluctuations in a non-equilibrium system governed by a stochastic partial differential equation (SPDE) with additive forcing. To this end, we deploy the so-called instanton formalism that corresponds to a saddle-point approximation of the action in the path integral formulation of the underlying SPDE. The crucial step in our approach is the formulation of an alternative SPDE that incorporates knowledge of the instanton solution such that we are able to constrain the dynamical evolutions around extreme flow configurations only. Finally, a reweighting procedure based on the Girsanov theorem is applied to recover the full distribution function of the original system. The entire procedure is demonstrated on the example of the one-dimensional Burgers equation. Furthermore, we compare our method to conventional direct numerical simulations as well as to Hybrid Monte Carlo methods. It will be shown that the instanton-based sampling method outperforms both approaches and allows for an accurate quantification of the whole probability density function of velocity gradients from the core to the very far tails.
We have studied the distribution of traffic flow $q$ for the Nagel-Schreckenberg model by computer simulations. We applied a large-deviation approach, which allowed us to obtain the distribution $P(q)$ over more than one hundred decades in probability, down to probabilities like $10^{-140}$. This allowed us to characterize the flow distribution over a large range of the support and identify the characteristics of rare and even very rare traffic situations. We observe a change of the distribution shape when increasing the density of cars from the free flow to the congestion phase. Furthermore, we characterize typical and rare traffic situations by measuring correlations of $q$ to other quantities like density of standing cars or number and size of traffic jams.
Reducing the variance of the gradient estimator is known to improve the convergence rate of stochastic gradient-based optimization and sampling algorithms. One way of achieving variance reduction is to design importance sampling strategies. Recently, the problem of designing such schemes was formulated as an online learning problem with bandit feedback, and algorithms with sub-linear static regret were designed. In this work, we build on this framework and propose Avare, a simple and efficient algorithm for adaptive importance sampling for finite-sum optimization and sampling with decreasing step-sizes. Under standard technical conditions, we show that Avare achieves $\mathcal{O}(T^{2/3})$ and $\mathcal{O}(T^{5/6})$ dynamic regret for SGD and SGLD respectively when run with $\mathcal{O}(1/t)$ step sizes. We achieve this dynamic regret bound by leveraging our knowledge of the dynamics defined by the algorithm, and combining ideas from online learning and variance-reduced stochastic optimization. We validate empirically the performance of our algorithm and identify settings in which it leads to significant improvements.
For machine learning models trained with limited labeled training data, validation stands to become the main bottleneck to reducing overall annotation costs. We propose a statistical validation algorithm that accurately estimates the F-score of binary classifiers for rare categories, where finding relevant examples to evaluate on is particularly challenging. Our key insight is that simultaneous calibration and importance sampling enables accurate estimates even in the low-sample regime (< 300 samples). Critically, we also derive an accurate single-trial estimator of the variance of our method and demonstrate that this estimator is empirically accurate at low sample counts, enabling a practitioner to know how well they can trust a given low-sample estimate. When validating state-of-the-art semi-supervised models on ImageNet and iNaturalist2017, our method achieves the same estimates of model performance with up to 10x fewer labels than competing approaches. In particular, we can estimate model F1 scores with a variance of 0.005 using as few as 100 labels.