Including prior knowledge is important for effective machine learning models in physics, and is usually achieved by explicitly adding loss terms or by constraining model architectures. Prior knowledge embedded in the physics computation itself rarely draws attention. We show that solving the Kohn-Sham equations when training neural networks for the exchange-correlation functional provides an implicit regularization that greatly improves generalization. Training on only two H$_2$ separations suffices for learning the entire one-dimensional dissociation curve within chemical accuracy, including the strongly correlated region. Our models also generalize to unseen types of molecules and overcome self-interaction error.
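The idea of treating the solver itself as a regularizer can be illustrated with a small sketch (not the authors' implementation): define the energy through an unrolled Kohn-Sham-style self-consistent loop whose effective potential contains a tiny neural model, and backpropagate the training loss through the loop. The grid, external potential, network shape, and reference energy below are all illustrative assumptions.

```python
import jax
import jax.numpy as jnp

N, L = 64, 10.0                           # grid points and box length (toy values)
x = jnp.linspace(0.0, L, N)
dx = x[1] - x[0]
v_ext = 0.5 * (x - L / 2) ** 2            # assumed harmonic external potential

def kinetic(npts):
    # -1/2 d^2/dx^2 by second-order finite differences
    main = jnp.full(npts, 1.0 / dx**2)
    off = jnp.full(npts - 1, -0.5 / dx**2)
    return jnp.diag(main) + jnp.diag(off, 1) + jnp.diag(off, -1)

T = kinetic(N)

def xc_potential(params, density):
    # tiny pointwise MLP standing in for a learned XC potential
    w1, b1, w2, b2 = params
    h = jnp.tanh(density[:, None] * w1 + b1)
    return (h @ w2 + b2).ravel()

def scf(params, n_iter=8):
    density = jnp.full(N, 1.0 / L)        # crude initial guess
    for _ in range(n_iter):               # unrolled loop: gradients flow through it
        ham = T + jnp.diag(v_ext + xc_potential(params, density))
        eps, phi = jnp.linalg.eigh(ham)
        new_density = phi[:, 0] ** 2 / dx # single occupied orbital (toy)
        density = 0.5 * density + 0.5 * new_density  # linear mixing
    return eps[0], density

def loss(params, e_ref):
    e0, _ = scf(params)
    return (e0 - e_ref) ** 2

key1, key2 = jax.random.split(jax.random.PRNGKey(0))
params = (0.1 * jax.random.normal(key1, (1, 8)), jnp.zeros(8),
          0.1 * jax.random.normal(key2, (8, 1)), jnp.zeros(1))

e_ref = 0.6                               # assumed reference energy for one geometry
grads = jax.grad(loss)(params, e_ref)     # backprop through the whole SCF loop
print(jax.tree_util.tree_map(jnp.shape, grads))
```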
Last year, at least 30,000 scientific papers used the Kohn-Sham scheme of density functional theory to solve electronic structure problems in a wide variety of scientific fields, ranging from materials science to biochemistry to astrophysics. Machine learning holds the promise of learning the kinetic energy functional via examples, bypassing the need to solve the Kohn-Sham equations. This should yield substantial savings in computer time, allowing either larger systems or longer time scales to be tackled, but attempts to machine-learn this functional have been limited by the need to find its derivative. The present work overcomes this difficulty by directly learning the density-potential and energy-density maps for test systems and various molecules. Both improved accuracy and lower computational cost are demonstrated by reproducing DFT energies for a range of molecular geometries generated during molecular dynamics simulations. Moreover, the methodology could be applied directly to quantum-chemical calculations, allowing the construction of density functionals of quantum-chemical accuracy.
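As a rough illustration of the two learned maps described above, the sketch below fits a potential-to-density map and a density-to-energy map with plain Gaussian kernel ridge regression on synthetic data; the toy potentials, densities, and energies are assumptions, not the paper's datasets or model.

```python
import numpy as np

def krr_fit(X, y, gamma=1.0, lam=1e-6):
    # Gaussian (RBF) kernel ridge regression: solve (K + lam*I) alpha = y
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.linalg.solve(np.exp(-gamma * d2) + lam * np.eye(len(X)), y)

def krr_predict(X_train, alpha, X_new, gamma=1.0):
    d2 = ((X_new[:, None, :] - X_train[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2) @ alpha

rng = np.random.default_rng(0)
grid = np.linspace(0.0, 1.0, 50)

# toy training set: Gaussian "potentials", stand-in "densities" and "energies"
centers = rng.uniform(0.3, 0.7, size=20)
V = np.exp(-((grid[None, :] - centers[:, None]) ** 2) / 0.01)   # potentials on the grid
n = V / V.sum(axis=1, keepdims=True)                            # stand-in densities
E = (n * grid).sum(axis=1)                                      # stand-in energies

a_dens = krr_fit(V, n)     # potential -> density map (one KRR output per grid point)
a_en = krr_fit(n, E)       # density -> energy map

# predict the energy of a new "molecule" without any self-consistent solve
V_new = np.exp(-((grid - 0.5) ** 2) / 0.01)[None, :]
n_pred = krr_predict(V, a_dens, V_new)
E_pred = krr_predict(n, a_en, n_pred)
print(E_pred)
```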
A Kohn-Sham (KS) inversion determines a KS potential and orbitals corresponding to a given electron density, a procedure that has applications in developing and evaluating functionals used in density functional theory. Despite the utility of KS
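For the special case of a single occupied orbital in one dimension, a KS inversion can be written down directly, which makes a compact illustration (this is the textbook one-orbital formula, not a specific method from the text above): with $\phi = \sqrt{n}$, the potential is $v(x) = \tfrac{1}{2}\,\phi''(x)/\phi(x)$ up to an additive constant. The grid and test density below are illustrative assumptions.

```python
import numpy as np

N, L = 200, 10.0
x = np.linspace(0.0, L, N)
dx = x[1] - x[0]

# test density: ground-state-like Gaussian, so the recovered potential is known
n = np.exp(-((x - L / 2) ** 2))
n /= n.sum() * dx                                 # normalize to one electron

phi = np.sqrt(n)                                  # one-orbital wavefunction
lap = np.gradient(np.gradient(phi, dx), dx)       # finite-difference second derivative
v = 0.5 * lap / phi                               # inverted KS potential (up to a constant)

# check against the harmonic potential whose ground state has this density
v_exact = 0.5 * (x - L / 2) ** 2
shift = v_exact[N // 2] - v[N // 2]
print("max interior deviation:", np.abs((v + shift - v_exact)[20:-20]).max())
```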
In high-temperature density functional theory simulations (from tens of eV to keV), the total number of Kohn-Sham orbitals is a critical quantity for obtaining accurate results. To establish the relationship between the number of orbitals and the level of occupation of the highest orbital, we derive a model based on the properties of the electron gas at finite temperature. This model predicts the total number of orbitals required to reach a given level of occupation and thus a stipulated precision. Levels of occupation as low as $10^{-4}$, and below, must be considered to obtain results converged to better than 1%, making high-temperature simulations very time consuming beyond a few tens of eV. After assessing the predictions of the model against previous results and ABINIT minimizations, we show how the extended FPMD method of Zhang et al. [Phys. Plasmas 23, 042707 (2016)] makes it possible to bypass these strong constraints on the number of orbitals at high temperature.
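A rough numerical sketch of this kind of estimate, under the assumption that the model amounts to Fermi-Dirac statistics applied to a free electron gas (the abstract's model may differ in detail): determine the chemical potential from the electron density and temperature, then count free-electron states down to a chosen occupation cutoff. Atomic units, the example density and temperature, and the two-electrons-per-orbital convention are illustrative assumptions.

```python
import numpy as np

def fermi(eps, mu, T):
    # Fermi-Dirac occupation f(eps) = 1 / (1 + exp((eps - mu)/T)), T in Hartree
    return 1.0 / (1.0 + np.exp((eps - mu) / T))

def electron_density(mu, T):
    # free electron gas, spin factor 2: n_e = (sqrt(2)/pi^2) * int sqrt(eps) f(eps) deps
    eps = np.linspace(1e-8, max(mu + 40.0 * T, 1.0), 20000)
    y = np.sqrt(eps) * fermi(eps, mu, T)
    return np.sqrt(2.0) / np.pi**2 * np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(eps))

def chemical_potential(n_e, T):
    # bisection on the density equation for mu
    lo, hi = -50.0 * T, 50.0 * T + 10.0
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if electron_density(mid, T) < n_e:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def orbitals_per_volume(f_min, n_e, T):
    mu = chemical_potential(n_e, T)
    eps_cut = mu + T * np.log(1.0 / f_min - 1.0)   # energy where occupation drops to f_min
    # free-electron count of states below eps_cut, two electrons per orbital
    return (2.0 * eps_cut) ** 1.5 / (6.0 * np.pi**2)

n_e = 0.05                 # electrons per bohr^3 (illustrative)
T = 10.0 / 27.211          # 10 eV expressed in Hartree
for f_min in (1e-2, 1e-3, 1e-4):
    ratio = orbitals_per_volume(f_min, n_e, T) / (n_e / 2.0)
    print(f"cutoff {f_min:.0e}: ~{ratio:.1f}x the minimal number of orbitals")
```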
A detailed account of the Kohn-Sham algorithm from quantum chemistry, formulated rigorously in the very general setting of convex analysis on Banach spaces, is given here. Starting from a Levy-Lieb-type functional, its convex and lower semi-continuous extension is regularized to obtain differentiability. This extra layer makes it possible to rigorously introduce, in contrast to the common unregularized approach, a well-defined Kohn-Sham iteration scheme. Convergence in a weak sense is then proven. This generalized formulation is applicable to a wide range of different density-functional theories and possibly even to models outside of quantum mechanics.
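The regularization step in such treatments is typically the Moreau-Yosida construction; stated here as background rather than as a quotation from the text above, a proper convex and lower semi-continuous functional $F$ is replaced by its envelope
\[
F_\varepsilon(\rho) \;=\; \inf_{\rho'} \Big\{ F(\rho') + \tfrac{1}{2\varepsilon}\,\lVert \rho - \rho' \rVert^{2} \Big\}, \qquad \varepsilon > 0,
\]
which is convex, differentiable on a Hilbert space (and on suitably regular Banach spaces), and increases pointwise to $F$ as $\varepsilon \to 0$, so that a functional-derivative-based Kohn-Sham iteration can be defined for $F_\varepsilon$ while its minimizers approximate those of $F$.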
Machine translation (MT) systems translate text between different languages by automatically learning in-depth knowledge of bilingual lexicons, grammar, and semantics from the training examples. Although neural machine translation (NMT) has led the field of MT, we have a poor understanding of how and why it works. In this paper, we bridge the gap by assessing the bilingual knowledge learned by NMT models with a phrase table -- an interpretable table of bilingual lexicons. We extract the phrase table from the training examples that an NMT model correctly predicts. Extensive experiments on widely used datasets show that the phrase table is reasonable and consistent across language pairs and random seeds. Equipped with the interpretable phrase table, we find that NMT models learn patterns from simple to complex and distill essential bilingual knowledge from the training examples. We also revisit some advances that potentially affect the learning of bilingual knowledge (e.g., back-translation), and report some interesting findings. We believe this work opens a new angle to interpret NMT with statistical models, and provides empirical support for recent advances in improving NMT models.
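To make the phrase-table construction concrete, here is a generic sketch of the classical consistent-phrase-pair extraction used to build phrase tables from word-aligned sentence pairs; the paper's exact pipeline (which restricts extraction to examples the NMT model predicts correctly) may differ, and the toy sentences and alignments below are assumptions.

```python
from collections import Counter

def extract_phrase_pairs(src, tgt, align, max_len=4):
    """align is a set of (i, j) pairs: src word i is aligned to tgt word j."""
    pairs = []
    for i1 in range(len(src)):
        for i2 in range(i1, min(i1 + max_len, len(src))):
            points = [(i, j) for (i, j) in align if i1 <= i <= i2]
            if not points:
                continue
            j1 = min(j for _, j in points)
            j2 = max(j for _, j in points)
            if j2 - j1 + 1 > max_len:
                continue
            # consistency: no target word inside [j1, j2] aligned outside [i1, i2]
            if any(j1 <= j <= j2 and not i1 <= i <= i2 for (i, j) in align):
                continue
            pairs.append((" ".join(src[i1:i2 + 1]), " ".join(tgt[j1:j2 + 1])))
    return pairs

def build_phrase_table(corpus):
    pair_counts, src_counts = Counter(), Counter()
    for src, tgt, align in corpus:
        for s, t in extract_phrase_pairs(src, tgt, align):
            pair_counts[(s, t)] += 1
            src_counts[s] += 1
    # relative-frequency translation probabilities p(t | s)
    return {(s, t): c / src_counts[s] for (s, t), c in pair_counts.items()}

# toy word-aligned "training examples" (alignments are 0-based source -> target)
corpus = [
    (["das", "haus"], ["the", "house"], {(0, 0), (1, 1)}),
    (["das", "buch"], ["the", "book"], {(0, 0), (1, 1)}),
]
for (s, t), p in sorted(build_phrase_table(corpus).items()):
    print(f"{s} ||| {t} ||| {p:.2f}")
```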