Traditional maximum entropy and sparsity-based algorithms for analytic continuation often suffer from an ill-posed kernel matrix or demand tremendous computation time for parameter tuning. Here we propose a neural network method based on convex optimization that replaces the ill-posed inverse problem with a sequence of well-conditioned surrogate problems. After training, the learned optimizers are able to give solutions of high quality at low time cost and achieve higher parameter efficiency than heuristic fully-connected networks. The output can also be used as a neural default model that improves the performance of the maximum entropy method. Our methods may be easily extended to other high-dimensional inverse problems via large-scale pretraining.
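For context, a standard statement of the underlying inverse problem, which the abstract does not spell out: for fermions, the imaginary-time Green's function $G(\tau)$ is related to the real-frequency spectral function $A(\omega)$ by
\[
G(\tau) = \int_{-\infty}^{\infty} \frac{e^{-\tau\omega}}{1 + e^{-\beta\omega}}\, A(\omega)\, \mathrm{d}\omega , \qquad 0 \le \tau < \beta ,
\]
and discretization yields $G = K A$ with a kernel matrix $K$ whose singular values decay exponentially. This near-singular $K$ is the "ill-posed kernel matrix" the abstract refers to: small noise in $G$ is amplified enormously upon naive inversion.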
Learned optimizers are increasingly effective, with performance exceeding that of hand-designed optimizers such as Adam~\citep{kingma2014adam} on specific tasks \citep{metz2019understanding}. Despite the potential gains available, in current work the meta-training (or `outer-training') of the learned optimizer is performed by a hand-designed optimizer, or by an optimizer trained by a hand-designed optimizer \citep{metz2020tasks}. We show that a population of randomly initialized learned optimizers can be used to train themselves from scratch in an online fashion, without resorting to a hand-designed optimizer in any part of the process. A form of population based training is used to orchestrate this self-training. Although the randomly initialized optimizers initially make slow progress, as they improve they experience a positive feedback loop and become rapidly more effective at training themselves. We believe feedback loops of this type, where an optimizer improves itself, will be important and powerful in the future of machine learning. These methods not only provide a path towards increased performance, but, more importantly, reduce research and engineering effort.
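A minimal sketch of the population-based outer loop described above, on a toy problem. Everything here is illustrative: the "learned optimizer" is just a single learned log-learning-rate, the inner task is a 1-D quadratic, and the exploit/explore rule is generic population-based training. It does not capture the self-referential aspect (optimizers training their own kind), only the evaluate/exploit/explore structure.

```python
import random

def inner_train(log_lr, steps=20):
    """Train f(x) = x^2 with a candidate optimizer (here just a learning
    rate) and return the negated final loss as a fitness score."""
    x, lr = 1.0, 10.0 ** log_lr
    for _ in range(steps):
        x -= lr * 2.0 * x  # gradient step on f(x) = x^2
    return -(x * x)

def population_self_training(pop_size=16, generations=50):
    """Schematic population-based training: evaluate every candidate,
    then replace the worst half with perturbed copies of the best."""
    population = [random.uniform(-4.0, 0.0) for _ in range(pop_size)]
    for _ in range(generations):
        scores = [inner_train(p) for p in population]
        ranked = sorted(range(pop_size), key=lambda i: scores[i])
        best = population[ranked[-1]]
        for i in ranked[: pop_size // 2]:      # exploit + explore
            population[i] = best + random.gauss(0.0, 0.2)
    return max(population, key=inner_train)

if __name__ == "__main__":
    print("best log10 learning rate:", population_self_training())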
Learned optimizers are algorithms that can themselves be trained to solve optimization problems. In contrast to baseline optimizers (such as momentum or Adam) that use simple update rules derived from theoretical principles, learned optimizers use flexible, high-dimensional, nonlinear parameterizations. Although this can lead to better performance in certain settings, their inner workings remain a mystery. How is a learned optimizer able to outperform a well-tuned baseline? Has it learned a sophisticated combination of existing optimization techniques, or is it implementing completely new behavior? In this work, we address these questions by careful analysis and visualization of learned optimizers. We study learned optimizers trained from scratch on three disparate tasks, and discover that they have learned interpretable mechanisms, including momentum, gradient clipping, learning rate schedules, and a new form of learning rate adaptation. Moreover, we show how the dynamics of learned optimizers enable these behaviors. Our results help elucidate the previously murky understanding of how learned optimizers work, and establish tools for interpreting future learned optimizers.
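For comparison, here are the textbook forms of the hand-designed mechanisms the abstract says were rediscovered inside learned optimizers. This is not the learned optimizer's internals, just the classical counterparts against which such behavior is identified; all parameter names and defaults are illustrative.

```python
import numpy as np

def classic_update(param, grad, state, step, lr0=0.1, beta=0.9,
                   clip=1.0, decay=1e-3):
    """Hand-designed counterparts of mechanisms found in learned
    optimizers: gradient clipping, momentum, and a learning-rate
    schedule, applied in sequence to one parameter tensor."""
    grad = np.clip(grad, -clip, clip)               # gradient clipping
    state["m"] = beta * state.get("m", 0.0) + grad  # momentum accumulator
    lr = lr0 / (1.0 + decay * step)                 # learning-rate schedule
    return param - lr * state["m"], state
```

Calling this in a loop with an incrementing `step` reproduces momentum SGD with clipped gradients and a decaying learning rate; the paper's finding is that learned optimizers implement comparable mechanisms implicitly in their dynamics.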
A key goal of quantum chaos is to establish a relationship between the widely observed universal spectral fluctuations of clean quantum systems and random matrix theory (RMT). For single-particle systems with fully chaotic classical counterparts, the problem was partly solved by Berry (1985) within the so-called diagonal approximation of semiclassical periodic-orbit sums. The derivation of the full RMT spectral form factor $K(t)$ from semiclassics was completed only much later, in a tour de force by Mueller et al (2004). In recent years, questions of long-time dynamics at high energies, for which the full many-body energy spectrum becomes relevant, have come to the forefront even for simple many-body quantum systems, such as locally interacting spin chains. Such systems display two universal types of behaviour, termed the `many-body localized' and `ergodic' phases. In the ergodic phase, the spectral fluctuations are excellently described by RMT, even for very simple interactions and in the absence of any external source of disorder. Here we provide the first theoretical explanation for these observations. We compute $K(t)$ explicitly in the leading two orders in $t$ and show its agreement with RMT for non-integrable, time-reversal invariant many-body systems without classical counterparts, a generic example being Ising spin-1/2 models in a periodically kicked transverse field.
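For reference, one standard convention for the spectral form factor of a Floquet system (the abstract does not fix notation): with $U$ the propagator over one driving period and $N$ the Hilbert-space dimension,
\[
K(t) = \big\langle \big| \operatorname{tr} U^{t} \big|^{2} \big\rangle , \qquad t = 1, 2, \dots ,
\]
where $\langle \cdot \rangle$ denotes an ensemble or spectral average. For time-reversal invariant systems the RMT prediction (circular orthogonal ensemble) reads, for $t \ll N$,
\[
K_{\mathrm{COE}}(t) = 2t - t \ln\!\left(1 + \frac{2t}{N}\right) = 2t - \frac{2t^{2}}{N} + O\!\left(\frac{t^{3}}{N^{2}}\right) ,
\]
whose first two terms are, presumably, the "leading two orders in $t$" matched in the abstract.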
A method for analytic continuation of imaginary-time correlation functions (here obtained in quantum Monte Carlo simulations) to real-frequency spectral functions is proposed. By stochastically sampling a spectrum parametrized by a large number of delta functions, treated as a statistical-mechanics problem, the method avoids distortions caused (as demonstrated here) by configurational entropy in previous sampling methods. The key development is the suppression of this entropy by constraining the spectral weight to within identifiable optimal bounds and by imposing a set number of peaks. As a test case, the dynamic structure factor of the S=1/2 Heisenberg chain is computed. Very good agreement is found with Bethe Ansatz results in the ground state (including a sharp edge) and with exact diagonalization of small systems at elevated temperatures.
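A minimal sketch of the unconstrained sampling baseline that such methods build on: equal-amplitude delta functions whose positions are Metropolis-sampled with weight $e^{-\chi^2/2\Theta}$. The paper's key additions (spectral-weight bounds, a fixed number of peaks) are deliberately not implemented here, and all parameter values are illustrative.

```python
import numpy as np

def sac_sample(G_tau, tau, beta, n_delta=200, omega_max=5.0,
               theta=1.0, sweeps=5000, seed=0):
    """Toy stochastic analytic continuation: sample positions of n_delta
    equal-weight delta functions with Metropolis weight
    exp(-chi^2 / (2*Theta)); return the histogram-averaged spectrum."""
    rng = np.random.default_rng(seed)
    omegas = rng.uniform(0.0, omega_max, n_delta)

    def chi2(w):
        # Fermionic kernel: G(tau) = sum_i K(tau, w_i) / n_delta
        K = np.exp(-np.outer(tau, w)) / (1.0 + np.exp(-beta * w))
        return np.sum((K.sum(axis=1) / n_delta - G_tau) ** 2)

    c2 = chi2(omegas)
    edges = np.linspace(0.0, omega_max, 101)
    hist = np.zeros(len(edges) - 1)
    for _ in range(sweeps):
        i = rng.integers(n_delta)               # move one delta function
        old = omegas[i]
        omegas[i] = np.clip(old + rng.normal(0.0, 0.1 * omega_max),
                            0.0, omega_max)
        c2_new = chi2(omegas)
        if c2_new < c2 or rng.random() < np.exp((c2 - c2_new) / (2.0 * theta)):
            c2 = c2_new                         # accept the move
        else:
            omegas[i] = old                     # reject and restore
        hist += np.histogram(omegas, bins=edges)[0]
    return edges, hist / hist.sum()             # averaged spectrum A(omega)
```

In this unconstrained form the sampled average is broadened by configurational entropy, which is exactly the distortion the paper's constraints are designed to suppress.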
We explore the extended Koopmans theorem (EKT) within the phaseless auxiliary-field quantum Monte Carlo (AFQMC) method. The EKT allows for the direct calculation of electron addition and removal spectral functions using reduced density matrices of the $N$-particle system, and avoids the need for analytic continuation. The lowest level of EKT with AFQMC, called EKT1-AFQMC, is benchmarked using small molecules, 14-electron and 54-electron uniform electron gas supercells, and diamond at the $\Gamma$-point. Via comparison with numerically exact results (when possible) and coupled-cluster methods, we find that EKT1-AFQMC can reproduce the qualitative features of spectral functions for Koopmans-like charge excitations with errors in peak locations of less than 0.25 eV in a finite basis. We also note the numerical difficulties that arise in the EKT1-AFQMC eigenvalue problem, especially when back-propagated quantities are very noisy. We show how a systematic higher order EKT approach can correct errors in EKT1-based theories with respect to the satellite region of the spectral function. Our work will be of use for the study of low-energy charge excitations and spectral functions in correlated molecules and solids where AFQMC can be reliably performed.
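For orientation, one common statement of the lowest-order EKT removal problem (conventions, including signs, vary between references): a generalized eigenvalue problem built from reduced density matrices of the $N$-particle state,
\[
\sum_{q} F_{pq}\, c_{q} = \lambda \sum_{q} \gamma_{pq}\, c_{q} ,
\qquad
\gamma_{pq} = \langle \Psi_N | a_p^{\dagger} a_q | \Psi_N \rangle ,
\qquad
F_{pq} = \langle \Psi_N | a_p^{\dagger} [\hat{H}, a_q] | \Psi_N \rangle ,
\]
where the eigenvalues $\lambda$ give the removal energies. Because only expectation values over the $N$-particle wavefunction are needed, no analytic continuation is required; but when $\gamma$ is estimated from noisy back-propagated AFQMC data, its near-zero eigenvalues make the generalized eigenproblem ill-conditioned, which is the numerical difficulty the abstract mentions.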