This paper presents a strongly geometrical approach to the Fisher distance, which is a measure of dissimilarity between two probability distribution functions. The Fisher distance, as well as other divergence measures, is also used in many applications to establish a proper data average. The main purpose is to widen the range of possible interpretations and relations of the Fisher distance and its associated geometry for prospective applications. It focuses on statistical models of normal probability distribution functions and takes advantage of the connection with classical hyperbolic geometry to derive closed forms for the Fisher distance in several cases. Connections with the well-known Kullback-Leibler divergence measure are also derived.
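As a concrete illustration of the hyperbolic-geometry connection mentioned above, the sketch below evaluates the standard closed form for the Fisher distance between two univariate normal distributions, obtained by mapping $N(\mu, \sigma^2)$ to the point $(\mu/\sqrt{2}, \sigma)$ of the Poincaré upper half-plane and rescaling the hyperbolic distance by $\sqrt{2}$. This is a minimal sketch assuming NumPy; the function name and the numerical check are illustrative, not taken from the paper.

```python
import numpy as np

def fisher_rao_distance_normal(mu1, sigma1, mu2, sigma2):
    """Fisher distance between the univariate normals N(mu1, sigma1^2) and
    N(mu2, sigma2^2), via the Poincare half-plane point (mu/sqrt(2), sigma)."""
    # Squared Euclidean distance between the mapped points
    sq = (mu1 - mu2) ** 2 / 2.0 + (sigma1 - sigma2) ** 2
    # Hyperbolic distance in the half-plane, rescaled by sqrt(2)
    return np.sqrt(2.0) * np.arccosh(1.0 + sq / (2.0 * sigma1 * sigma2))

# Sanity check: for a common mean the distance reduces to sqrt(2)*|ln(sigma2/sigma1)|
print(fisher_rao_distance_normal(0.0, 1.0, 0.0, np.e))  # ~ 1.4142
print(np.sqrt(2.0) * abs(np.log(np.e)))                 # ~ 1.4142
```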
For a known weak signal in additive white noise, the asymptotic performance of a locally optimum processor (LOP) is shown to be given by the Fisher information (FI) of a standardized even probability density function (PDF) of the noise in three cases: (i) the maximum signal-to-noise ratio (SNR) gain for a periodic signal; (ii) the optimal asymptotic relative efficiency (ARE) for signal detection; (iii) the best cross-correlation gain (CG) for signal transmission. The minimal FI is unity, corresponding to a Gaussian PDF, whereas the FI is strictly larger than unity for any non-Gaussian PDF. In the sense of a realizable LOP, the dichotomous noise PDF is found to possess an infinite FI, so that known weak signals are perfectly processed by the corresponding LOP. The significance of the FI lies in the fact that it provides an upper bound on the performance of locally optimum processing.
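For intuition, the FI of a standardized (zero-mean, unit-variance) PDF $f$ is $I(f) = \int f'(x)^2 / f(x)\, dx$, which equals 1 for the Gaussian and exceeds 1 for any other unit-variance density. The sketch below checks this numerically for the standard Gaussian and for a unit-variance Laplace density (whose FI is 2). It is a minimal sketch assuming SciPy for quadrature; the helper name and integration limits are illustrative choices.

```python
import numpy as np
from scipy.integrate import quad

def fisher_information(pdf, dpdf, lo=-20.0, hi=20.0):
    """Numerical Fisher information I(f) = integral of f'(x)^2 / f(x) dx
    for a standardized (zero-mean, unit-variance) density f."""
    integrand = lambda x: dpdf(x) ** 2 / pdf(x)
    val, _ = quad(integrand, lo, hi, points=[0.0])
    return val

# Standard Gaussian: I = 1 (the minimum over unit-variance densities)
g = lambda x: np.exp(-x**2 / 2.0) / np.sqrt(2.0 * np.pi)
dg = lambda x: -x * g(x)
print(fisher_information(g, dg))      # ~ 1.0

# Laplace density standardized to unit variance (scale b = 1/sqrt(2)): I = 1/b^2 = 2
b = 1.0 / np.sqrt(2.0)
lap = lambda x: np.exp(-np.abs(x) / b) / (2.0 * b)
dlap = lambda x: -np.sign(x) / b * lap(x)
print(fisher_information(lap, dlap))  # ~ 2.0
```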
Complexity measures in the context of the Integrated Information Theory of consciousness try to quantify the strength of the causal connections between different neurons. This is done by minimizing the KL-divergence between a full system and one without causal connections. Various measures have been proposed and compared in this setting. We discuss a class of information geometric measures that aim at assessing the intrinsic causal influences in a system. One promising candidate among these measures, denoted by $\Phi_{CIS}$, is based on conditional independence statements and satisfies all of the properties that have been postulated as desirable. Unfortunately, it does not have a graphical representation, which makes it less intuitive and difficult to analyze. We propose an alternative approach using a latent variable which models a common exterior influence. This leads to a measure $\Phi_{CII}$, Causal Information Integration, that satisfies all of the required conditions. Our measure can be calculated using an iterative information geometric algorithm, the em-algorithm. We are therefore able to compare its behavior to that of existing integrated information measures.
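The following minimal sketch is not the paper's $\Phi_{CIS}$ or $\Phi_{CII}$; it only illustrates the basic step of minimizing a KL-divergence to a model without interactions. For a joint distribution of two binary units, the closest fully factorized distribution in KL-divergence is the product of its marginals, and the minimal divergence equals the mutual information. The numbers below are an arbitrary illustrative joint distribution.

```python
import numpy as np

p = np.array([[0.40, 0.10],
              [0.05, 0.45]])          # joint p(x, y) over two binary units
px = p.sum(axis=1, keepdims=True)     # marginal p(x)
py = p.sum(axis=0, keepdims=True)     # marginal p(y)
q = px * py                           # closest model with no interaction

kl = np.sum(p * np.log(p / q))        # D_KL(p || q) = mutual information I(X; Y)
print(kl)
```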
In this paper we introduce a new generalisation of the relative Fisher information for Markov jump processes on a finite or countable state space, and prove an inequality which connects this object with the relative entropy and a large deviation rate functional. We show that, in addition to possessing various favourable properties, this generalised Fisher information converges to the classical Fisher information in an appropriate limit. We then use this generalised Fisher information and the aforementioned inequality to qualitatively study coarse-graining problems for jump processes on discrete spaces.
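For background (these are the standard continuous-state definitions, not the paper's new object), the relative entropy and the classical relative Fisher information of a probability measure $\mu$ with respect to $\nu$ on $\mathbb{R}^d$ read
$$
\mathcal{H}(\mu \mid \nu) = \int \log\frac{\mathrm{d}\mu}{\mathrm{d}\nu}\,\mathrm{d}\mu,
\qquad
\mathcal{I}(\mu \mid \nu) = \int \left|\nabla \log\frac{\mathrm{d}\mu}{\mathrm{d}\nu}\right|^{2}\mathrm{d}\mu,
$$
and the generalisation introduced in the paper is constructed for jump processes on discrete spaces so as to recover the classical object $\mathcal{I}$ in an appropriate limit.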
Real-world data typically contain a large number of features that are often heterogeneous in nature, relevance, and units of measure. When assessing the similarity between data points, one can build various distance measures using subsets of these features. Using the fewest features while still retaining sufficient information about the system is crucial in many statistical learning approaches, particularly when data are sparse. We introduce a statistical test that can assess the relative information retained when using two different distance measures, and determine whether they are equivalent, independent, or one is more informative than the other. This in turn allows finding the most informative distance measure out of a pool of candidates. The approach is applied to find the most relevant policy variables for controlling the Covid-19 epidemic and to find compact yet informative representations of atomic structures, but its potential applications are wide-ranging in many branches of science.
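One simple way to realize such a comparison (a rough sketch only, not the paper's statistic or its significance test) is to ask whether the nearest neighbour of each point under distance A remains close under distance B: averaging the rank, under B, of the A-nearest neighbour gives a score near 0 when A is fully informative about B and near 1 when the two distances are unrelated. The function below assumes NumPy and SciPy; all names are illustrative.

```python
import numpy as np
from scipy.spatial.distance import cdist

def neighbour_rank_score(X_a, X_b):
    """Average rank under distance B of each point's nearest neighbour
    under distance A, rescaled to roughly [0, 1]."""
    n = X_a.shape[0]
    d_a = cdist(X_a, X_a)
    d_b = cdist(X_b, X_b)
    np.fill_diagonal(d_a, np.inf)
    np.fill_diagonal(d_b, np.inf)
    nn_a = d_a.argmin(axis=1)                      # nearest neighbour under A
    ranks_b = d_b.argsort(axis=1).argsort(axis=1)  # ranks (0 = nearest) under B
    return 2.0 * ranks_b[np.arange(n), nn_a].mean() / n

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 3))
print(neighbour_rank_score(X[:, :2], X))        # partial feature set vs full set
print(neighbour_rank_score(X[:, :1], X[:, 2:])) # unrelated features -> close to 1
```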
Matrix scaling is a classical problem with a wide range of applications. It is known that the Sinkhorn algorithm for matrix scaling can be interpreted as alternating e-projections from the viewpoint of classical information geometry. Recently, a generalization of matrix scaling to completely positive maps, called operator scaling, has been found to appear in various fields of mathematics and computer science, and the Sinkhorn algorithm has been extended to operator scaling. In this study, the operator Sinkhorn algorithm is studied from the viewpoint of quantum information geometry through the Choi representation of completely positive maps. The operator Sinkhorn algorithm is shown to coincide with alternating e-projections with respect to the symmetric logarithmic derivative metric, which is a Riemannian metric on the space of quantum states relevant to quantum estimation theory. Other types of alternating e-projection algorithms are also provided by using different information geometric structures on the positive definite cone.
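For the classical matrix case referred to above, the Sinkhorn iteration alternately normalizes the rows and columns of a positive matrix until it becomes (approximately) doubly stochastic; in the information-geometric reading of the abstract, each normalization step plays the role of an e-projection. Below is a minimal sketch assuming NumPy; the stopping rule and function name are simple illustrative choices, not the operator-scaling version studied in the paper.

```python
import numpy as np

def sinkhorn(A, n_iter=1000, tol=1e-12):
    """Scale a positive matrix A to a doubly stochastic matrix by
    alternately normalizing its rows and its columns."""
    M = A.astype(float).copy()
    for _ in range(n_iter):
        M /= M.sum(axis=1, keepdims=True)   # row normalization
        M /= M.sum(axis=0, keepdims=True)   # column normalization
        if np.allclose(M.sum(axis=1), 1.0, atol=tol):
            break
    return M

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 1.0, 1.0],
              [3.0, 1.0, 2.0]])
S = sinkhorn(A)
print(S.sum(axis=0), S.sum(axis=1))   # both ~ [1, 1, 1]
```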