We provide an information-geometric formulation of classical mechanics on the Riemannian manifold of probability distributions, which is an affine manifold endowed with a dually flat connection. In a non-parametric formalism, we consider the full set of positive probability functions on a finite sample space and give an explicit expression for the tangent and cotangent spaces over the statistical manifold, in terms of a Hilbert bundle structure that we call the Statistical Bundle. In this setting, we compute velocities and accelerations of a one-dimensional statistical model using the canonical dual pair of parallel transports, and we define a coherent formalism for Lagrangian and Hamiltonian mechanics on the bundle. Finally, in a series of examples, we show how our formalism provides a consistent framework for accelerated natural-gradient dynamics on the probability simplex, paving the way for direct applications in optimization, game theory, and neural networks.
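To make the last point concrete, here is a minimal numerical sketch of natural-gradient descent on the probability simplex under the Fisher-Rao metric. The exponential retraction and the KL objective are our illustrative choices, not constructions taken from the paper.

```python
import numpy as np

def natural_gradient_step(p, grad, lr=0.1):
    """One Fisher-Rao natural-gradient step on the probability simplex.

    p    : current distribution (strictly positive, sums to 1)
    grad : Euclidean gradient of the objective at p
    """
    # Natural-gradient direction under the Fisher-Rao metric:
    # v_i = p_i * (g_i - E_p[g]); note sum(v) = 0, so v is tangent
    # to the simplex.
    v = p * (grad - np.dot(p, grad))
    # Exponential-family style retraction keeps p strictly positive.
    q = p * np.exp(-lr * v / p)
    return q / q.sum()

# Example: descend the KL divergence KL(p || target).
target = np.array([0.5, 0.3, 0.2])
p = np.full(3, 1 / 3)
for _ in range(200):
    grad = np.log(p / target) + 1.0   # d/dp_i KL(p || target)
    p = natural_gradient_step(p, grad, lr=0.5)
print(p.round(3))                     # approaches [0.5, 0.3, 0.2]
```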
A framework for statistical-mechanical analysis of quantum Hamiltonians is introduced. The approach is based upon a gradient flow equation in the space of Hamiltonians such that the eigenvectors of the initial Hamiltonian evolve toward those of the reference Hamiltonian. The nonlinear double-bracket equation governing the flow is such that the eigenvalues of the initial Hamiltonian remain unperturbed. The space of Hamiltonians is foliated by compact invariant subspaces, which permits the construction of statistical distributions over the Hamiltonians. In two dimensions, an explicit dynamical model is introduced, wherein the density function on the space of Hamiltonians approaches an equilibrium state characterised by the canonical ensemble. This is used to compute quenched and annealed averages of quantum observables.
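As a numerical illustration of the isospectral property, the following sketch integrates one common convention of the double-bracket flow, dH/dt = [H, [H, N]]; the matrices, step size, and sign convention are our assumptions, not the paper's exact equation.

```python
import numpy as np

def comm(a, b):
    """Matrix commutator [a, b]."""
    return a @ b - b @ a

rng = np.random.default_rng(1)

# Reference Hamiltonian N and a random initial Hermitian Hamiltonian H.
N = np.diag([0.0, 0.5, 1.0])
A = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
H = (A + A.conj().T) / 2
H /= np.linalg.norm(H)          # keep the Euler steps small

spectrum0 = np.sort(np.linalg.eigvalsh(H))

# The double-bracket flow is isospectral: the spectrum of H is a constant
# of motion, while H is driven toward the eigenbasis of N ([H, N] -> 0).
dt, steps = 1e-3, 50_000
for _ in range(steps):
    H += dt * comm(H, comm(H, N))
    H = (H + H.conj().T) / 2    # re-Hermitise against round-off

print(np.sort(np.linalg.eigvalsh(H)) - spectrum0)  # ~0, up to integrator error
print(np.linalg.norm(comm(H, N)))                  # small: H nearly diagonal in N's basis
```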
The main result of this note is a characterization of the Poisson commutativity of Hamilton functions in terms of their principal action functions.
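For reference, the standard notions the note relies on, in our notation (not necessarily the paper's): two Hamilton functions Poisson-commute, i.e. are in involution, when their Poisson bracket vanishes, and a principal action function is a solution of the Hamilton-Jacobi equation.

```latex
% Poisson bracket of F, G on phase space (q, p); commutativity of two
% Hamilton functions means {H_1, H_2} = 0.
\{F, G\} = \sum_{i=1}^{n} \left(
    \frac{\partial F}{\partial q^i}\frac{\partial G}{\partial p_i}
  - \frac{\partial F}{\partial p_i}\frac{\partial G}{\partial q^i} \right),
\qquad \{H_1, H_2\} = 0.

% Principal action function S of a Hamilton function H: a solution of the
% Hamilton-Jacobi equation
\frac{\partial S}{\partial t}
  + H\!\left(q, \frac{\partial S}{\partial q}\right) = 0.
```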
We show that there exists an underlying manifold with a conformal metric and compatible connection form, together with a metric-type Hamiltonian (which we call the geometrical picture), that can be put into correspondence with the usual Hamilton-Lagrange mechanics. The requirement of dynamical equivalence of the two types of Hamiltonians, namely that the momenta generated by the two pictures be equal for all times, suffices to determine an expansion of the conformal factor, defined on the geometrical coordinate representation, within its domain of analyticity: the coefficients are fixed to all orders by functions of the potential of the Hamilton-Lagrange picture, defined on the Hamilton-Lagrange coordinate representation, and of its derivatives. Conversely, if the conformal function is known, the potential of a Hamilton-Lagrange picture can be determined in a similar way. We show that arbitrary local variations of the orbits in the Hamilton-Lagrange picture can be generated by variations along geodesics in the geometrical picture, and we establish a correspondence that provides a basis for understanding how instability in the geometrical picture is manifested in the instability of the original Hamiltonian motion.
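Schematically, the two pictures can be summarised as follows; the notation is ours and the specific form of the conformal factor is only determined order by order, as described above.

```latex
% Hamilton-Lagrange picture: potential-type Hamiltonian.
H_{\mathrm{HL}}(q, p) = \frac{1}{2m}\,\delta^{ij} p_i p_j + V(q).

% Geometrical picture: metric-type Hamiltonian with a conformal metric.
H_{\mathrm{G}}(x, \pi) = \frac{1}{2m}\, g^{ij}(x)\, \pi_i \pi_j,
\qquad g^{ij}(x) = \varphi(x)\,\delta^{ij}.

% Dynamical equivalence: \pi_i(t) = p_i(t) for all t, which fixes the
% expansion coefficients of \varphi to all orders in terms of V and its
% derivatives (and conversely determines V from a given \varphi).
```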
We generalize standard credal set models for imprecise probabilities to include higher-order credal sets -- confidences about confidences. In doing so, we specify how an agent's higher-order confidences (credal sets) update upon observing an event. Our model begins to address standard issues with imprecise probability models, such as Dilation and Belief Inertia. We conjecture that when higher-order credal sets contain all possible probability functions, the highest-order confidences converge in the limit to a uniform distribution over the first-order credal set, where uniformity is defined in terms of the statistical distance metric (total variation distance). Finite simulations support the conjecture. We further suggest that this convergence presents the total-variation-uniform distribution as a natural, privileged prior for statistical hypothesis testing.
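As a small computational companion, here is a sketch of the distance notion used in the conjecture: the total variation distance between finite distributions, applied to distributions sampled from a toy first-order credal set. The Dirichlet sampling and the reference point are our illustrative choices, not the paper's setup.

```python
import numpy as np

def total_variation(p, q):
    """Total variation distance between two finite distributions."""
    return 0.5 * np.abs(np.asarray(p) - np.asarray(q)).sum()

# Toy illustration: sample distributions from a first-order credal set
# (here: all distributions on 3 outcomes, via a uniform Dirichlet) and
# measure their TV distance to a fixed reference point of the set.
rng = np.random.default_rng(0)
samples = rng.dirichlet(alpha=[1.0, 1.0, 1.0], size=10_000)
reference = np.full(3, 1 / 3)
d = np.array([total_variation(p, reference) for p in samples])
print(d.mean(), d.max())   # spread of the credal set in TV distance
```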
A central issue in many statistical learning problems is selecting an appropriate model from a set of candidate models. Large models tend to inflate the variance (overfitting), while small models tend to introduce bias (underfitting) for a given fixed dataset. In this work, we address the critical challenge of model selection: striking a balance between model fit and model complexity so as to gain reliable predictive power. We consider the task of approaching the theoretical limit of statistical learning, meaning that the selected model predicts as well as the best possible model in a class of potentially misspecified candidate models. We propose a generalized notion of Takeuchi's information criterion and prove that the proposed method can asymptotically achieve the optimal out-of-sample prediction loss under reasonable assumptions. To the best of our knowledge, this is the first proof of this asymptotic property of Takeuchi's information criterion. Our proof applies to a wide variety of nonlinear models, loss functions, and high-dimensional settings (in the sense that the model's complexity can grow with the sample size). The proposed method can be used as a computationally efficient surrogate for leave-one-out cross-validation. Moreover, for modeling streaming data, we propose an online algorithm that sequentially expands the model complexity to enhance selection stability and reduce computation cost. Experimental studies show that the proposed method has desirable predictive power and significantly lower computational cost than some popular methods.
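For a concrete feel for the criterion, here is a hedged one-parameter sketch of the classical Takeuchi information criterion (not the paper's generalized version) for a Gaussian location model. Under correct specification the penalty reduces to AIC's 2k; misspecification inflates it.

```python
import numpy as np

def tic_gaussian_location(x):
    """Takeuchi's information criterion for the N(mu, 1) location model.

    TIC = -2 * loglik(mu_hat) + 2 * tr(J^{-1} I), where J is the expected
    negative Hessian and I the outer product of scores, both evaluated at
    the MLE. Under correct specification I = J and TIC reduces to AIC.
    """
    n = len(x)
    mu = x.mean()                           # MLE of the location
    loglik = -0.5 * np.sum((x - mu) ** 2) - 0.5 * n * np.log(2 * np.pi)
    J = 1.0                                 # -E[d^2 log f / d mu^2] = 1
    I = np.mean((x - mu) ** 2)              # E[(d log f / d mu)^2]
    return -2 * loglik + 2 * (I / J)

rng = np.random.default_rng(0)
well_specified = rng.normal(0.0, 1.0, size=500)  # variance matches the model
misspecified = rng.normal(0.0, 2.0, size=500)    # true variance is 4
print(tic_gaussian_location(well_specified))     # penalty ~ 2, as in AIC
print(tic_gaussian_location(misspecified))       # penalty ~ 8, inflated by misspecification
```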