Challenges for Machine Learning Force Fields in Reproducing Potential Energy Surfaces of Flexible Molecules


Abstract in English

Dynamics of flexible molecules are often determined by an interplay between local chemical bond fluctuations and conformational changes driven by long-range electrostatics and van der Waals interactions. This interplay yields complex potential-energy surfaces (PES) with multiple minima and transition paths between them. In this work, we assess the performance of state-of-the-art machine learning (ML) models, namely sGDML, SchNet, GAP/SOAP, and BPNN, in reproducing such PES while using limited amounts of reference data. As a benchmark, we use the cis-to-trans thermal relaxation of an azobenzene molecule, where at least three different transition mechanisms should be considered. Although GAP/SOAP, SchNet, and sGDML models can globally achieve chemical accuracy of 1 kcal mol⁻¹ with fewer than 1000 training points, predictions greatly depend on the ML method used as well as on the local region of the PES being sampled. Within a given ML method, large differences can be found between predictions for close-to-equilibrium and transition regions, as well as for different transition mechanisms. We identify key challenges that the ML models face in learning long-range interactions and the intrinsic limitations of commonly used atom-based descriptors. Overall, our results suggest switching from learning the entire PES within a single model to using multiple local models with optimized descriptors, training sets, and architectures for different parts of a complex PES.
