Markov State Models (MSMs) are widely used to elucidate dynamic properties of molecular systems from unbiased Molecular Dynamics (MD) simulations. However, the implementation of reweighting schemes for MSMs to analyze biased simulations, for example those produced by enhanced sampling techniques, is still at an early stage of development. Several dynamical reweighting approaches have been proposed, which can be classified as approaches based on (i) Kramers rate theory, (ii) rescaling of the probability density flux, (iii) reweighting by formulating a likelihood function, and (iv) path reweighting. We present the state of the art and discuss the methodological differences between these methods, their limitations, and recent applications.
The sensitivity of molecular dynamics to changes in the potential energy function plays an important role in understanding the dynamics and function of complex molecules. We present a method to obtain path ensemble averages of a perturbed dynamics from a set of paths generated by a reference dynamics. It is based on the concept of a path probability measure and the Girsanov theorem, a result from stochastic analysis used to estimate a change of measure of a path ensemble. Since Markov state models (MSMs) of the molecular dynamics can be formulated as a combined phase-space and path ensemble average, the method can be extended to reweight MSMs by combining it with a reweighting of the Boltzmann distribution. We demonstrate how to efficiently implement the Girsanov reweighting in a molecular dynamics simulation program by calculating parts of the reweighting factor on the fly during the simulation, and we benchmark the method on test systems ranging from a two-dimensional diffusion process to an artificial many-body system and alanine dipeptide and valine dipeptide in implicit and explicit water. The method can be used to study the sensitivity of molecular dynamics to external perturbations as well as to reweight trajectories generated by enhanced sampling schemes to the original dynamics.
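The on-the-fly accumulation of the path reweighting factor described above can be illustrated with a minimal sketch. It assumes a one-dimensional overdamped Langevin equation dx = b(x) dt + sigma dW integrated with the Euler-Maruyama scheme, which is a simplification chosen here for illustration and not the setup of the benchmarked systems. The function names, the double-well reference drift, and the constant extra drift are all hypothetical. The sketch covers only the path part of the reweighting; the Boltzmann reweighting of the initial points needed for a full MSM is omitted.

```python
import math
import random


def simulate_with_girsanov(x0, n_steps, dt, sigma, ref_drift, extra_drift, rng):
    """Integrate the reference dynamics and accumulate the log Girsanov factor.

    Reference dynamics: dx = ref_drift(x) dt + sigma dW (Euler-Maruyama).
    Target dynamics:    dx = (ref_drift(x) + extra_drift(x)) dt + sigma dW.
    Returns the path and log(M), where M is the ratio of the discretized
    path probability under the target to that under the reference dynamics.
    """
    x = x0
    path = [x]
    log_m = 0.0
    sqrt_dt = math.sqrt(dt)
    for _ in range(n_steps):
        eta = rng.gauss(0.0, 1.0)   # the random number that drives this step
        c = extra_drift(x)          # extra drift at the position before the step
        # Both sums of the reweighting factor are updated during the simulation.
        log_m += (c / sigma) * sqrt_dt * eta - 0.5 * (c / sigma) ** 2 * dt
        x = x + ref_drift(x) * dt + sigma * sqrt_dt * eta
        path.append(x)
    return path, log_m


def log_path_density(path, dt, sigma, drift):
    """Log probability density of a discretized path for a given drift."""
    var = sigma * sigma * dt
    total = 0.0
    for x, x_next in zip(path, path[1:]):
        resid = x_next - x - drift(x) * dt
        total += -0.5 * resid * resid / var - 0.5 * math.log(2.0 * math.pi * var)
    return total


def ref_drift(x):
    """Hypothetical reference drift -dV/dx for the double well V(x) = (x^2 - 1)^2."""
    return -4.0 * x * (x * x - 1.0)


def extra_drift(x):
    """Hypothetical perturbation drift -dU/dx for the linear tilt U(x) = 0.5 x."""
    return -0.5
```

Under these assumptions, a path ensemble average for the perturbed dynamics would be estimated by weighting each reference path's observable with exp(log_m). For the discretized dynamics, log_m equals the difference between `log_path_density` evaluated with the perturbed drift and with the reference drift, which gives a direct consistency check.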
Markov state models (MSMs) have been successful in computing metastable states, slow relaxation timescales and associated structural changes, and stationary or kinetic experimental observables of complex molecules from large amounts of molecular dynamics simulation data. However, MSMs approximate the true dynamics by assuming a Markov chain on a cluster discretization of the state space. This approximation is difficult to make for high-dimensional biomolecular systems, and the quality and reproducibility of MSMs have therefore been limited. Here, we discard the assumption that dynamics are Markovian on the discrete clusters. Instead, we only assume that the full phase-space molecular dynamics is Markovian, and a projection of this full dynamics is observed on the discrete states, leading to the concept of Projected Markov Models (PMMs). Robust estimation methods for PMMs are not yet available, but we derive a practically feasible approximation via Hidden Markov Models (HMMs). It is shown how various molecular observables of interest that are often computed from MSMs can be computed from HMMs / PMMs. The new framework is applicable to both simulation and single-molecule experimental data. We demonstrate its versatility by applications to educative model systems, a 1 ms Anton MD simulation of the BPTI protein, and an optical tweezer force probe trajectory of an RNA hairpin.
Using a set of oscillator strengths and excited-state dipole moments of near full configuration interaction (FCI) quality determined for small compounds, we benchmark the performances of several single-reference wave function methods (CC2, CCSD, CC3, CCSDT, ADC(2), and ADC(3/2)) and time-dependent density-functional theory (TD-DFT) with various functionals (B3LYP, PBE0, M06-2X, CAM-B3LYP, and $\omega$B97X-D). We consider the impact of various gauges (length, velocity, and mixed) and formalisms: equation of motion (EOM) \emph{vs} linear response (LR), relaxed \emph{vs} unrelaxed orbitals, etc. Beyond the expected accuracy improvements and a clear decrease of formalism sensitivity when using higher-order wave function methods, the present contribution shows that, for both ADC(2) and CC2, the choice of gauge affects the magnitude of the oscillator strengths more significantly than the choice of formalism, and that CCSD yields a notable improvement on this transition property as compared to CC2. For the excited-state dipole moments, switching on orbital relaxation appreciably improves the accuracy of both ADC(2) and CC2, but has a rather small effect at the CCSD level. Going from ground to excited states, the typical errors on dipole moments for a given method tend to roughly triple. Interestingly, the ADC(3/2) oscillator strengths and dipoles are significantly more accurate than their ADC(2) counterparts, whereas the two models deliver rather similar absolute errors for transition energies. Concerning TD-DFT, one finds: i) a rather negligible impact of the gauge on oscillator strengths for all tested functionals (except for M06-2X); ii) deviations of ca.~0.10 D on ground-state dipoles for all functionals; iii) the better overall performance of CAM-B3LYP for the two considered excited-state properties.
Stochastic gradient MCMC (SG-MCMC) algorithms have proven useful in scaling Bayesian inference to large datasets under an assumption of i.i.d. data. We instead develop an SG-MCMC algorithm to learn the parameters of hidden Markov models (HMMs) for time-dependent data. There are two challenges to applying SG-MCMC in this setting: the latent discrete states, and the need to break dependencies when considering minibatches. We consider a marginal likelihood representation of the HMM and propose an algorithm that harnesses the inherent memory decay of the process. We demonstrate the effectiveness of our algorithm on synthetic experiments and on ion channel recording data, with runtimes significantly outperforming batch MCMC.
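The marginal likelihood representation mentioned above sums out the latent discrete states, which the standard forward algorithm does in time linear in the sequence length. The following is a minimal sketch of that marginalization only, assuming discrete emissions and plain Python lists; the function name and argument layout are hypothetical, and the stochastic gradient and minibatch machinery of the algorithm is not shown.

```python
import math


def hmm_log_likelihood(pi, A, B, obs):
    """Log marginal likelihood of an observation sequence under a discrete HMM.

    pi[i]   : initial probability of hidden state i
    A[i][j] : transition probability from state i to state j
    B[i][k] : probability of emitting symbol k from state i
    obs     : list of observed symbol indices

    Uses the scaled forward recursion: the forward vector is renormalized at
    every step and the logs of the normalizers are summed, which avoids
    numerical underflow on long sequences.
    """
    n = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    log_lik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate one step through the chain, then weight by the emission.
            alpha = [
                sum(alpha[i] * A[i][j] for i in range(n)) * B[j][obs[t]]
                for j in range(n)
            ]
        norm = sum(alpha)
        log_lik += math.log(norm)
        alpha = [a / norm for a in alpha]
    return log_lik
```

Because the hidden states are summed out exactly, the result depends only on the model parameters, so gradients with respect to those parameters are well defined even though the states themselves are discrete.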
We provide a pedagogical introduction to the two main variants of real-space quantum Monte Carlo methods for electronic-structure calculations: variational Monte Carlo (VMC) and diffusion Monte Carlo (DMC). Assuming no prior knowledge of the subject, we review in depth the Metropolis-Hastings algorithm used in VMC for sampling the square of an approximate wave function, discussing details important for applications to electronic systems. We also review in detail the more sophisticated DMC algorithm within the fixed-node approximation, introduced to avoid the infamous fermionic sign problem, which allows one to sample a more accurate approximation to the ground-state wave function. Throughout this review, we discuss the statistical methods used for evaluating expectation values and statistical uncertainties. In particular, we show how to estimate nonlinear functions of expectation values and their statistical uncertainties.
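As a toy illustration of Metropolis sampling of the square of a trial wave function, the sketch below treats a one-dimensional harmonic oscillator (H = -1/2 d^2/dx^2 + 1/2 x^2) with the trial function psi(x) = exp(-alpha x^2). This model problem, the function names, and the simple blocking estimate of the statistical error are assumptions made here for illustration; they are not taken from the review, which targets electronic systems.

```python
import math
import random


def local_energy(x, alpha):
    """Local energy (H psi)/psi for psi(x) = exp(-alpha x^2)."""
    return alpha + x * x * (0.5 - 2.0 * alpha * alpha)


def vmc_energy(alpha, n_steps, step_size, rng, n_equil=1000):
    """Sample |psi|^2 with a symmetric uniform proposal; return local energies."""
    x = 0.0
    energies = []
    for step in range(n_equil + n_steps):
        x_new = x + step_size * (2.0 * rng.random() - 1.0)
        # log of |psi(x_new)|^2 / |psi(x)|^2
        log_ratio = -2.0 * alpha * (x_new * x_new - x * x)
        if log_ratio >= 0.0 or rng.random() < math.exp(log_ratio):
            x = x_new
        if step >= n_equil:  # discard the equilibration steps
            energies.append(local_energy(x, alpha))
    return energies


def blocked_mean_and_error(samples, n_blocks):
    """Mean and standard error from block averages.

    Successive Metropolis samples are correlated, so the error is estimated
    from the scatter of block means rather than of individual samples.
    """
    block_len = len(samples) // n_blocks
    means = [
        sum(samples[b * block_len:(b + 1) * block_len]) / block_len
        for b in range(n_blocks)
    ]
    mean = sum(means) / n_blocks
    var = sum((m - mean) ** 2 for m in means) / (n_blocks - 1)
    return mean, math.sqrt(var / n_blocks)
```

For alpha = 0.5 the trial function is the exact ground state, so the local energy is constant at 0.5 and the statistical error vanishes. For other values of alpha the exact variational energy is alpha/2 + 1/(8 alpha), which the sampled mean should approach within the blocked error estimate, provided the blocks are long compared with the autocorrelation time.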