A popular way to accelerate the sampling of rare events in molecular dynamics simulations is to introduce a potential that increases the fluctuations of selected collective variables. For this strategy to be successful, it is critical to choose appropriate variables. Here we review some recent developments in the data-driven design of collective variables, with a focus on the combination of Fisher's discriminant analysis and neural networks. This approach makes it possible to compress the fluctuations of metastable states into a low-dimensional representation. We illustrate through several examples the effectiveness of this method in accelerating sampling, while also identifying the physical descriptors that undergo the most significant changes in the process.
Sampling complex free energy surfaces is one of the main challenges of modern atomistic simulation methods. The presence of kinetic bottlenecks in such surfaces often renders a direct approach useless. A popular strategy is to identify a small number of key collective variables and to introduce a bias potential that favors their fluctuations in order to accelerate sampling. Here we propose to use machine learning techniques in conjunction with the recent variationally enhanced sampling method [Valsson and Parrinello, Physical Review Letters 2014] to determine such a potential. This is achieved by expressing the bias as a neural network. The parameters are determined in a reinforcement learning scheme aimed at minimizing the variationally enhanced sampling functional. This required the development of a new and more efficient minimization technique. The expressivity of neural networks allows accelerating sampling in systems with rapidly varying free energy surfaces, removing boundary-effect artifacts, and taking one more step towards being able to handle several collective variables.
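For orientation, the functional that is minimized can be written as below. This is a sketch of the form given in the cited Valsson and Parrinello paper, not a formula stated in the abstract itself; the symbols are assumptions of this note: s denotes the collective variables, F(s) the free energy surface, V(s) the bias, p(s) a chosen target distribution, and β the inverse temperature.

```latex
\Omega[V] = \frac{1}{\beta}
  \log \frac{\int \mathrm{d}s \, e^{-\beta \left[ F(s) + V(s) \right]}}
            {\int \mathrm{d}s \, e^{-\beta F(s)}}
  + \int \mathrm{d}s \, p(s) \, V(s),
\qquad
V_{\min}(s) = -F(s) - \frac{1}{\beta} \log p(s) + \text{const}.
```

When the bias is a neural network with parameters θ, the quantity that is driven to zero is the gradient of Ω with respect to θ, which is the difference between the average of ∂V/∂θ over the target distribution p(s) and its average over the biased ensemble.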
Designing an appropriate set of collective variables is crucial to the success of several enhanced sampling methods. Here we focus on how to obtain such variables from information limited to the metastable states. We characterize these states by a large set of descriptors and employ neural networks to compress this information into a lower-dimensional space, using Fisher's linear discriminant as an objective function to maximize the discriminative power of the network. We test this method on alanine dipeptide, using the non-linearly separable dataset composed of atomic distances. We then study an intermolecular aldol reaction characterized by a concerted mechanism. The resulting variables are able to promote sampling by drawing non-linear paths in the physical space connecting the fluctuations between metastable basins. Lastly, we interpret the behavior of the neural network by studying its relation to the physical variables. Through the identification of its most relevant features, we are able to gain chemical insight into the process.
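The Fisher criterion used as the objective can be illustrated with a minimal two-state sketch. It works directly on feature vectors, standing in for the output of the network's last layer, and it is not the authors' implementation. The function name `fisher_direction`, the regularization `reg`, and the toy data are assumptions made for illustration only.

```python
import numpy as np


def fisher_direction(x0, x1, reg=1e-6):
    """Return the unit vector maximizing Fisher's ratio for two states.

    x0, x1: arrays of shape (n_samples, n_features), one per metastable state.
    reg: small diagonal term (an assumption) that keeps S_w invertible.
    """
    x0 = np.asarray(x0, dtype=float)
    x1 = np.asarray(x1, dtype=float)
    m0, m1 = x0.mean(axis=0), x1.mean(axis=0)
    d0, d1 = x0 - m0, x1 - m1
    # Within-class scatter: fluctuations inside each metastable state.
    s_w = d0.T @ d0 + d1.T @ d1 + reg * np.eye(x0.shape[1])
    # Between-class separation enters through the difference of the means.
    diff = m1 - m0
    w = np.linalg.solve(s_w, diff)
    w /= np.linalg.norm(w)
    # Fisher's ratio: between-class over within-class variance along w.
    ratio = (w @ diff) ** 2 / (w @ s_w @ w)
    return w, ratio


# Toy data: two identical clouds displaced along the first feature.
basin_a = np.array([[-1.0, 0.0], [1.0, 0.0], [0.0, -1.0], [0.0, 1.0]])
basin_b = basin_a + np.array([4.0, 0.0])
w, ratio = fisher_direction(basin_a, basin_b)
print(w, ratio)
```

In a neural-network setting, the same ratio would be evaluated on the hidden representation and its gradient propagated back to the network weights, so that the network learns features in which the states are maximally separated.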
We propose a rigorous construction of a 1D path collective variable to sample structural phase transformations in condensed matter. The path collective variable is defined in a space spanned by global collective variables that serve as classifiers derived from local structural units. A reliable identification of local structural environments is achieved by employing a neural-network-based classification. The 1D path collective variable is subsequently used together with enhanced sampling techniques to explore the complex migration of a phase boundary during a solid-solid phase transformation in molybdenum.
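The abstract does not give the functional form of the path variable. As a hedged illustration, the sketch below uses one widely used definition of a path collective variable, a weighted average of reference indices with exponential weights in the squared distance to each reference. The function name `path_cv`, the parameter `lam`, and the reference points are hypothetical and chosen only for this example.

```python
import numpy as np


def path_cv(q, references, lam):
    """Progress along a path defined by reference points in CV space.

    q: current point in the space of global collective variables.
    references: array of shape (n_refs, n_cvs), ordered along the path.
    lam: smoothing parameter; larger values make the assignment sharper.
    """
    q = np.asarray(q, dtype=float)
    refs = np.asarray(references, dtype=float)
    d2 = np.sum((refs - q) ** 2, axis=1)
    # Subtract the smallest distance before exponentiating for stability;
    # the common factor cancels in the ratio below.
    weights = np.exp(-lam * (d2 - d2.min()))
    index = np.arange(1, len(refs) + 1)
    return float(np.sum(index * weights) / np.sum(weights))


# Hypothetical three-node path along the first coordinate.
nodes = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
print(path_cv([1.0, 0.0], nodes, lam=2.0))
```

With this definition the variable runs from 1 at the first reference to the number of references at the last one, so a single scalar tracks progress along the transformation.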
Computing accurate reaction rates is a central challenge in computational chemistry and biology because of the high cost of free energy estimation with unbiased molecular dynamics. In this work, a data-driven machine learning algorithm is devised to learn collective variables with a multitask neural network, where a common upstream part reduces the high dimensionality of atomic configurations to a low-dimensional latent space, and separate downstream parts map the latent space to predictions of basin class labels and potential energies. The resulting latent space is shown to be an effective low-dimensional representation, capturing the reaction progress and guiding effective umbrella sampling to obtain accurate free energy landscapes. This approach is successfully applied to model systems including a 5D Müller-Brown model, a 5D three-well model, and alanine dipeptide in vacuum. This approach enables automated dimensionality reduction for energy-controlled reactions in complex systems, offers a unified framework that can be trained with limited data, and outperforms single-task learning approaches, including autoencoders.
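A multitask objective of this kind can be sketched as the sum of a classification term for the basin labels and a regression term for the potential energies. The abstract does not specify the loss functions or their weighting, so the cross-entropy and mean-squared-error terms, the function name `multitask_loss`, and the parameter `energy_weight` are assumptions of this example.

```python
import numpy as np


def multitask_loss(logits, labels, e_pred, e_true, energy_weight=1.0):
    """Combine basin classification and potential energy regression.

    logits: array (n_samples, n_basins) from the classification head.
    labels: integer basin label for each sample.
    e_pred, e_true: predicted and reference potential energies.
    energy_weight: relative weight of the energy term (an assumption).
    """
    logits = np.asarray(logits, dtype=float)
    labels = np.asarray(labels, dtype=int)
    # Numerically stable log-softmax over the basin classes.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    cross_entropy = -log_probs[np.arange(len(labels)), labels].mean()
    # Mean squared error on the potential energies.
    diff = np.asarray(e_pred, dtype=float) - np.asarray(e_true, dtype=float)
    mse = np.mean(diff ** 2)
    return float(cross_entropy + energy_weight * mse)


# Uninformative logits and perfect energies give a loss of log(2).
print(multitask_loss([[0.0, 0.0], [0.0, 0.0]], [0, 1], [1.0, 2.0], [1.0, 2.0]))
```

Because both heads share the upstream layers, minimizing a combined loss of this form pushes the latent space to encode both which basin a configuration belongs to and how its energy varies.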
The computational study of conformational transitions in RNA and proteins with atomistic molecular dynamics often requires suitable enhanced sampling techniques. Here we introduce a novel method where concurrent metadynamics are integrated in a Hamiltonian replica-exchange scheme. The ladder of replicas is built with different strengths of the bias potential, exploiting the tunability of well-tempered metadynamics. Using this method, free-energy barriers of individual collective variables are significantly reduced compared with simple force-field scaling. The introduced methodology is flexible and allows adaptive bias potentials to be self-consistently constructed for a large number of simple collective variables, such as distances and dihedral angles. The method is tested on alanine dipeptide and applied to the difficult problem of conformational sampling in a tetranucleotide.
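The tunability mentioned above can be made concrete with the bias factor γ of well-tempered metadynamics: at convergence the bias cancels a fraction 1 - 1/γ of the free energy along a collective variable, so a barrier is reduced to 1/γ of its original height. The sketch below builds a ladder of bias factors and reports the residual barrier in each replica. The geometric spacing, the function names, and the numbers are assumptions for illustration; the abstract does not say how the ladder is spaced.

```python
import numpy as np


def bias_factor_ladder(n_replicas, gamma_max):
    """Geometrically spaced bias factors from 1 (unbiased) to gamma_max."""
    if n_replicas < 2 or gamma_max <= 1.0:
        raise ValueError("need at least two replicas and gamma_max > 1")
    return np.geomspace(1.0, gamma_max, n_replicas)


def residual_barrier(barrier, gamma):
    """Barrier left after a converged well-tempered bias with factor gamma."""
    return barrier / np.asarray(gamma, dtype=float)


# Hypothetical example: 4 replicas, barrier of 8 (arbitrary energy units).
ladder = bias_factor_ladder(4, 8.0)
print(ladder)
print(residual_barrier(8.0, ladder))
```

The replica with γ = 1 carries no bias and therefore samples the unbiased ensemble, while higher replicas cross barriers more easily and pass configurations down the ladder through exchanges.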