The subjects of the paper are the likelihood method (LM) and the expected Fisher information (FI), considered from the point of view of the construction of physical models that originate in the statistical description of phenomena. The master equation case and the structural information principle are derived. Then, the phenomenological description of information transfer is presented. The extreme physical information (EPI) method is reviewed. As a side result, the statistical interpretation of the amplitude of the system is given. The formalism developed in this paper could also be applied in quantum information processing and quantum game theory.
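To make the expected FI concrete, a minimal sketch for a toy statistical model follows; the exponential decay model and the Monte Carlo check are illustrative assumptions, not material from the paper.

```python
# A minimal sketch (not from the paper): Monte Carlo estimate of the
# expected Fisher information for an exponential model
# f(t; lam) = lam * exp(-lam * t), where I(lam) = 1 / lam**2 analytically.
import numpy as np

rng = np.random.default_rng(0)
lam = 2.0
t = rng.exponential(scale=1.0 / lam, size=1_000_000)

# Score function: d/d(lam) of log f(t; lam) = 1/lam - t
score = 1.0 / lam - t

# Expected FI = E[score^2]; compare with the closed form 1/lam^2
fi_mc = np.mean(score**2)
print(fi_mc, 1.0 / lam**2)   # ~0.25 vs 0.25
```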
An approach based on the Fisher information (FI) is developed to quantify the maximum information gain and optimal experimental design in neutron reflectometry experiments. In these experiments, the FI can be calculated analytically and used to provide sub-second predictions of parameter uncertainties. This approach can be used to inform real-time decisions about measurement angle, measurement time, contrast choice and other experimental conditions based on the parameters of interest. The FI provides a lower bound on parameter estimation uncertainties, and these are shown to decrease as the inverse square root of measurement time, providing useful information for the planning and scheduling of experimental work. As the FI is computationally inexpensive to calculate, it can be computed repeatedly during the course of an experiment, saving costly beam time by signalling that sufficient data have been obtained, or saving experimental datasets by signalling that an experiment needs to continue. The approach's predictions are validated through the introduction of an experiment simulation framework that incorporates instrument-specific incident flux profiles, and through the investigation of measuring the structural properties of a phospholipid bilayer.
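The Cramér–Rao reasoning behind such predictions can be illustrated with a toy Poisson counting model; the model, function names and numerical-derivative scheme below are assumptions for exposition, not the paper's code.

```python
# A hedged sketch: Fisher information for a Poisson counting experiment,
# illustrating the 1/sqrt(time) scaling of the Cramer-Rao bound.
import numpy as np

def model_counts(theta, x):
    """Toy reflectivity-like model: exponential decay with rate theta[0]."""
    return np.exp(-theta[0] * x)

def fisher_information(theta, x, time, eps=1e-6):
    mu = time * model_counts(theta, x)          # expected counts
    # Numerical Jacobian of the expected counts w.r.t. theta
    J = np.empty((x.size, theta.size))
    for j in range(theta.size):
        dt = np.zeros_like(theta); dt[j] = eps
        J[:, j] = time * (model_counts(theta + dt, x)
                          - model_counts(theta - dt, x)) / (2 * eps)
    # Poisson FI matrix: sum_i J_ij J_ik / mu_i
    return J.T @ (J / mu[:, None])

theta = np.array([0.5])
x = np.linspace(0.01, 5.0, 100)
for time in (1.0, 4.0, 16.0):
    fi = fisher_information(theta, x, time)
    sigma = np.sqrt(np.linalg.inv(fi)[0, 0])    # Cramer-Rao lower bound
    print(time, sigma)   # sigma halves each time `time` quadruples
```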
This paper presents a statistical method to subtract background in a maximum likelihood fit without relying on any separate sideband or simulation for background modelling. The method, called sFit, is an extension of the sPlot technique originally developed to reconstruct the true distribution of each data component. The sWeights defined for the sPlot technique allow one to construct a modified likelihood function using only the signal probability density function and events in the signal region. The contribution of background events in the signal region to the likelihood function cancels out on a statistical basis. Maximizing this likelihood function leads to unbiased estimates of the fit parameters in the signal probability density function.
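The core of the construction can be sketched as a weighted likelihood built from the signal PDF alone; the toy dataset and stand-in weights below are hypothetical (real sWeights come from the sPlot procedure), so this is only an illustration of the modified likelihood, not the sFit implementation.

```python
# A minimal sketch under stated assumptions: given per-event weights `sw`
# (stand-ins here; in practice the sWeights of the sPlot technique),
# maximize a weighted likelihood using only the signal PDF.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(1)
x_sig = rng.normal(1.2, 0.3, size=500)      # toy signal events
x_bkg = rng.uniform(0.0, 3.0, size=500)     # toy background events
x = np.concatenate([x_sig, x_bkg])
# Stand-in weights: +1 for signal, 0 for background. Real sWeights take
# positive and negative values whose background contribution cancels
# only on a statistical basis.
sw = np.concatenate([np.ones(500), np.zeros(500)])

def neg_weighted_ll(theta):
    mu, sigma = theta
    if sigma <= 0:
        return np.inf                       # keep the scale parameter valid
    return -np.sum(sw * norm.logpdf(x, mu, sigma))

res = minimize(neg_weighted_ll, x0=[1.0, 0.5], method="Nelder-Mead")
print(res.x)   # estimates of (mu, sigma) for the signal PDF
```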
For a known weak signal in additive white noise, the asymptotic performance of a locally optimum processor (LOP) is shown to be given by the Fisher information (FI) of a standardized even probability density function (PDF) of the noise in three cases: (i) the maximum signal-to-noise ratio (SNR) gain for a periodic signal; (ii) the optimal asymptotic relative efficiency (ARE) for signal detection; (iii) the best cross-correlation gain (CG) for signal transmission. The minimal FI is unity, corresponding to a Gaussian PDF, whereas the FI is strictly larger than unity for any non-Gaussian PDF. In the sense of a realizable LOP, it is found that the dichotomous noise PDF possesses an infinite FI for known weak signals perfectly processed by the corresponding LOP. The significance of the FI lies in the fact that it provides an upper bound on the performance of locally optimum processing.
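The defining integral $I(f) = \int (f'(x)/f(x))^2 f(x)\,dx$ can be checked numerically for standardized PDFs; the snippet below is an illustrative verification (not from the paper) that the Gaussian attains the minimum of unity while a unit-variance Laplace PDF gives 2.

```python
# Numerical check of I(f) = ∫ (f'(x)/f(x))^2 f(x) dx for standardized
# (zero-mean, unit-variance) PDFs: Gaussian gives 1, Laplace gives 2.
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm, laplace

def fisher_info(pdf, dpdf):
    integrand = lambda x: dpdf(x)**2 / pdf(x)
    return quad(integrand, -12, 12, limit=200)[0]

# Standard Gaussian: d/dx phi(x) = -x * phi(x)
g  = norm(0, 1)
dg = lambda x: -x * g.pdf(x)
print(fisher_info(g.pdf, dg))            # ≈ 1.0 (the minimum)

# Laplace with unit variance: scale b = 1/sqrt(2)
b  = 1 / np.sqrt(2)
l  = laplace(0, b)
dl = lambda x: -np.sign(x) / b * l.pdf(x)
print(fisher_info(l.pdf, dl))            # ≈ 2.0 > 1
```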
Exponential Random Graph Models (ERGMs) have gained increasing popularity over the years. Rooted in statistical physics, the ERGM framework has been successfully employed for reconstructing networks, detecting statistically significant patterns in graphs, and counting networked configurations with given properties. From a technical point of view, the ERGM workflow is defined by two subsequent optimization steps: the first concerns the maximization of Shannon entropy and leads to the identification of the functional form of the ensemble probability distribution that is maximally non-committal with respect to the missing information; the second concerns the maximization of the likelihood function induced by this probability distribution and leads to its numerical determination. This second step translates into the resolution of a system of $O(N)$ non-linear, coupled equations (with $N$ being the total number of nodes of the network under analysis), a problem that is affected by three main issues, i.e. accuracy, speed and scalability. The present paper aims at addressing these problems by comparing the performance of three algorithms (i.e. Newton's method, a quasi-Newton method and a recently proposed fixed-point recipe) in solving several ERGMs, defined by binary and weighted constraints in both a directed and an undirected fashion. While Newton's method performs best for relatively small networks, the fixed-point recipe is to be preferred when large configurations are considered, as it ensures convergence to the solution within seconds for networks with hundreds of thousands of nodes (e.g. the Internet, Bitcoin). We attach to the paper a Python code implementing the three aforementioned algorithms on all the ERGMs considered in the present work.
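For the simplest binary undirected case, the configuration model with degree constraints, a fixed-point iteration of this kind can be sketched as follows; the toy degree sequence, initial guess and convergence settings are illustrative assumptions, and the Python code attached to the paper should be consulted for the actual implementations.

```python
# A hedged sketch of a fixed-point recipe for the binary undirected
# configuration model, where p_ij = x_i x_j / (1 + x_i x_j) and the
# constraints are the observed degrees k_i.
import numpy as np

def ubcm_fixed_point(k, n_iter=1000, tol=1e-10):
    x = k / np.sqrt(k.sum())            # sparse-limit initial guess
    for _ in range(n_iter):
        # denom[i, j] = x_j / (1 + x_i * x_j), excluding self-loops
        denom = x[None, :] / (1.0 + np.outer(x, x))
        np.fill_diagonal(denom, 0.0)
        x_new = k / denom.sum(axis=1)   # x_i = k_i / sum_{j != i} ...
        if np.max(np.abs(x_new - x)) < tol:
            return x_new
        x = x_new
    return x

k = np.array([1, 2, 2, 2, 3], dtype=float)   # toy degree sequence
x = ubcm_fixed_point(k)
P = np.outer(x, x) / (1.0 + np.outer(x, x))
np.fill_diagonal(P, 0.0)
print(P.sum(axis=1))   # expected degrees ≈ k
```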
In this paper, we consider a surrogate modeling approach using a data-driven nonparametric likelihood function constructed on a manifold on which the data lie (or to which they are close). The proposed method represents the likelihood function using a spectral expansion formulation known as the kernel embedding of the conditional distribution. To respect the geometry of the data, we employ this spectral expansion using a set of data-driven basis functions obtained from the diffusion maps algorithm. The theoretical error estimate suggests that the error bound of the approximate data-driven likelihood function is independent of the variance of the basis functions, which allows us to determine the amount of training data needed for accurate likelihood function estimation. Supporting numerical results demonstrating the robustness of the data-driven likelihood functions for parameter estimation are given on instructive examples involving stochastic and deterministic differential equations. When the dimension of the data manifold is strictly less than the dimension of the ambient space, we find that the proposed approach (which does not require knowledge of the data manifold) is superior to likelihood functions constructed using standard parametric basis functions defined on the ambient coordinates. In an example where the data manifold is not smooth and unknown, the proposed method is more robust than an existing polynomial chaos surrogate model, the non-intrusive spectral projection, which assumes a parametric likelihood.
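A bare-bones diffusion maps construction of such data-driven basis functions might look as follows; the normalization choices, bandwidth and toy dataset are assumptions for illustration and may differ from the algorithm used in the paper.

```python
# An illustrative diffusion maps sketch: eigenvectors of a normalized
# kernel matrix serve as basis functions adapted to the data manifold.
import numpy as np

def diffusion_maps_basis(X, eps, n_basis=10):
    # Pairwise squared distances and Gaussian kernel
    d2 = np.sum((X[:, None, :] - X[None, :, :])**2, axis=-1)
    K = np.exp(-d2 / eps)
    # Density normalization (alpha = 1), then row-normalize to a Markov matrix
    q = K.sum(axis=1)
    K = K / np.outer(q, q)
    P = K / K.sum(axis=1, keepdims=True)
    # Leading eigenvectors of P are the data-driven basis functions
    vals, vecs = np.linalg.eig(P)
    order = np.argsort(-vals.real)[:n_basis]
    return vals.real[order], vecs.real[:, order]

# Toy data on a circle in R^2 (manifold dimension 1 < ambient dimension 2)
theta = np.linspace(0, 2 * np.pi, 200, endpoint=False)
X = np.c_[np.cos(theta), np.sin(theta)]
vals, basis = diffusion_maps_basis(X, eps=0.1)
print(vals[:4])   # leading eigenvalues; basis columns ≈ Fourier modes
```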