Variational Monte Carlo (VMC) is an approach for computing ground-state wavefunctions that has recently become more powerful due to the introduction of neural network-based wavefunction parametrizations. However, efficiently training neural wavefunctions to converge to an energy minimum remains a difficult problem. In this work, we analyze optimization and sampling methods used in VMC and introduce alterations to improve their performance. First, based on theoretical convergence analysis in a noiseless setting, we motivate a new optimizer that we call the Rayleigh-Gauss-Newton (RGN) method, which can improve upon gradient descent and natural gradient descent to achieve superlinear convergence with little added computational cost. Second, in order to realize this favorable comparison in the presence of stochastic noise, we analyze the effect of sampling error on VMC parameter updates and experimentally demonstrate that it can be reduced by the parallel tempering method. In particular, we demonstrate that RGN can be made robust to energy spikes that occur when new regions of configuration space become available to the sampler over the course of optimization. Finally, putting theory into practice, we apply our enhanced optimization and sampling methods to the transverse-field Ising and XXZ models on large lattices, yielding ground-state energy estimates with remarkably high accuracy after just 200-500 parameter updates.
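As a point of reference for the optimizers compared above, the following is a minimal NumPy sketch of a stochastic-reconfiguration (natural-gradient) VMC parameter update assembled from sampled log-derivatives and local energies. The array shapes, learning rate, and regularization shift are illustrative assumptions, and the sketch is not the Rayleigh-Gauss-Newton update itself, which further incorporates curvature information from the Rayleigh quotient.

```python
import numpy as np

def sr_update(O, E_loc, lr=0.05, shift=1e-3):
    """One stochastic-reconfiguration (natural-gradient) step for real parameters.

    O[n, i]  : derivative of log psi_theta at sample x_n w.r.t. parameter theta_i
    E_loc[n] : local energy (H psi_theta)(x_n) / psi_theta(x_n)
    """
    dO = O - O.mean(axis=0)                  # centered log-derivatives
    dE = E_loc - E_loc.mean()                # centered local energies
    g = 2.0 * dO.T @ dE / len(E_loc)         # estimate of the energy gradient
    S = dO.T @ dO / len(E_loc)               # estimate of the overlap (metric) matrix
    # Regularized natural-gradient step; RGN-type updates replace S + shift*I with
    # a Gauss-Newton-style approximation of the Hessian of the Rayleigh quotient.
    return -lr * np.linalg.solve(S + shift * np.eye(S.shape[1]), g)

# Shape-only usage with synthetic inputs (not a physical model):
rng = np.random.default_rng(0)
theta_step = sr_update(rng.normal(size=(4096, 12)), rng.normal(size=4096))
print(theta_step.shape)                      # (12,)
```

Plain gradient descent corresponds to dropping the linear solve and stepping along the negative gradient; the choice of curvature matrix inside the solve is what distinguishes natural gradient descent from Gauss-Newton-type methods such as RGN.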
We introduce an ensemble Markov chain Monte Carlo approach to sampling from a probability density with known likelihood. This method upgrades an underlying Markov chain by allowing an ensemble of such chains to interact via a process in which one chain's state is cloned as another's is deleted. This effective teleportation of states can overcome issues of metastability in the underlying chain, as the scheme enjoys rapid mixing once the modes of the target density have been populated. We derive a mean-field limit for the evolution of the ensemble. We analyze the global and local convergence of this mean-field limit, showing asymptotic convergence independent of the spectral gap of the underlying Markov chain, and moreover we interpret the limiting evolution as a gradient flow. We explain how interaction can be applied selectively to a subset of state variables in order to maintain an advantage on very high-dimensional problems. Finally, we present the application of our methodology to Bayesian hyperparameter estimation for Gaussian process regression.
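To make the structure of such an interacting ensemble concrete, here is a toy Python sketch in which every walker takes an independent random-walk Metropolis step and, occasionally, one walker's state is cloned while another's is deleted. The bimodal target, the uniform pair selection, and the Metropolis-style cloning rule are placeholders chosen for brevity; they are not the interaction analyzed above and carry none of its mean-field convergence guarantees.

```python
import numpy as np

def log_target(x):
    # Unevenly weighted bimodal density; the modes are far apart relative to the
    # proposal scale, so an isolated random-walk chain is metastable.
    return np.logaddexp(np.log(0.8) - 0.5 * ((x - 4.0) / 0.5) ** 2,
                        np.log(0.2) - 0.5 * ((x + 4.0) / 0.5) ** 2)

def ensemble_step(xs, rng, step=0.5, p_interact=0.2):
    # 1) Independent random-walk Metropolis update for every walker.
    prop = xs + step * rng.normal(size=xs.shape)
    accept = np.log(rng.uniform(size=xs.shape)) < log_target(prop) - log_target(xs)
    xs = np.where(accept, prop, xs)
    # 2) Occasional interaction: clone walker i's state over walker j's
    #    (placeholder acceptance rule, not the one derived in the work above).
    if rng.uniform() < p_interact:
        i, j = rng.choice(len(xs), size=2, replace=False)
        if np.log(rng.uniform()) < log_target(xs[i]) - log_target(xs[j]):
            xs[j] = xs[i]
    return xs

rng = np.random.default_rng(1)
xs = rng.uniform(-6.0, 6.0, size=64)   # both modes populated at initialization
for _ in range(5000):
    xs = ensemble_step(xs, rng)
print("fraction of walkers near the heavier mode:", np.mean(xs > 0.0))
```

Without step 2, the per-mode walker populations would stay frozen near their initial values, since individual chains essentially never cross between the modes; the clone/delete move is what lets the ensemble reallocate walkers across modes once both are populated, which is the teleportation effect described above (the toy rule, however, is not claimed to reproduce the correct mode weights).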
The extended Lagrangian molecular dynamics (XLMD) method provides a useful framework for reducing the computational cost of a class of molecular dynamics simulations with constrained latent variables. The XLMD method relaxes the constraints by introducing a fictitious mass $\varepsilon$ for the latent variables, solving a set of singularly perturbed ordinary differential equations. While favorable numerical performance of XLMD has been demonstrated in several different contexts in the past decade, mathematical analysis of the method remains scarce. We propose the first error analysis of the XLMD method in the context of a classical polarizable force field model. While the dynamics with respect to the atomic degrees of freedom are general and nonlinear, the key mathematical simplification of the polarizable force field model is that the constraints on the latent variables are given by a linear system of equations. We prove that when the initial value of the latent variables is compatible in a sense that we define, XLMD converges as the fictitious mass $\varepsilon$ is made small with $\mathcal{O}(\varepsilon)$ error for the atomic degrees of freedom and with $\mathcal{O}(\sqrt{\varepsilon})$ error for the latent variables, when the dimension of the latent variable $d$ is 1. Furthermore, when the initial value of the latent variables is improved to be optimally compatible in a certain sense, we prove that the convergence rate can be improved to $\mathcal{O}(\varepsilon)$ for the latent variables as well. Numerical results verify that both estimates are sharp not only for $d = 1$, but also for arbitrary $d$. In the setting of general $d$, we do obtain convergence, but with the non-sharp rate of $\mathcal{O}(\sqrt{\varepsilon})$ for both the atomic and latent variables.
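For intuition about the relaxation described above, the following toy sketch integrates a singularly perturbed system of the XLMD form for a hypothetical scalar model ($d = 1$): one atomic coordinate q, one latent variable y constrained by a linear equation a*y = b(q), and a fictitious mass $\varepsilon$ (denoted eps in the code). The specific force, constraint, tolerances, and choice of compatible initialization are assumptions made only to exhibit how the relaxed trajectory approaches the constrained one as $\varepsilon$ shrinks; they are not the polarizable force field model analyzed in the work.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Toy scalar model: linear constraint a*y = b(q), smooth coupling force f(q, y).
a = 2.0
b = lambda q: np.sin(q)
f = lambda q, y: -q + 0.5 * y

def constrained_rhs(t, z):
    # Reference dynamics with the constraint enforced exactly: y = b(q) / a.
    q, v = z
    return [v, f(q, b(q) / a)]

def xlmd_rhs(eps):
    # XLMD-style relaxation: eps * y'' = -(a*y - b(q)) replaces a*y = b(q).
    def rhs(t, z):
        q, v, y, w = z
        return [v, f(q, y), w, -(a * y - b(q)) / eps]
    return rhs

T, q0, v0 = 5.0, 1.0, 0.0
ref = solve_ivp(constrained_rhs, (0.0, T), [q0, v0], rtol=1e-10, atol=1e-12)
q_ref = ref.y[0, -1]
y_ref = b(q_ref) / a

for eps in [1e-2, 1e-3, 1e-4]:
    y0, w0 = b(q0) / a, 0.0              # start the latent variable on the constraint
    sol = solve_ivp(xlmd_rhs(eps), (0.0, T), [q0, v0, y0, w0],
                    rtol=1e-10, atol=1e-12)
    print(f"eps={eps:.0e}  |q - q_ref| = {abs(sol.y[0, -1] - q_ref):.2e}"
          f"  |y - y_ref| = {abs(sol.y[2, -1] - y_ref):.2e}")
```

The printed errors give a rough sense of how the discrepancy between the relaxed and constrained trajectories shrinks with $\varepsilon$; the precise rates established above depend on the compatibility notions defined in the work, which this toy initialization only imitates.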