No Arabic abstract
The extraction of a physical law y=yo(x) from joint experimental data about x and y is treated. The joint, the marginal and the conditional probability density functions (PDF) are expressed by given data over an estimator whose kernel is the instrument scattering function. As an optimal estimator of yo(x) the conditional average is proposed. The analysis of its properties is based upon a new definition of prediction quality. The joint experimental information and the redundancy of joint measurements are expressed by the relative entropy. With the number of experiments the redundancy on average increases, while the experimental information converges to a certain limit value. The difference between this limit value and the experimental information at a finite number of data represents the discrepancy between the experimentally determined and the true properties of the phenomenon. The sum of the discrepancy measure and the redundancy is utilized as a cost function. By its minimum a reasonable number of data for the extraction of the law yo(x) is specified. The mutual information is defined by the marginal and the conditional PDFs of the variables. The ratio between mutual information and marginal information is used to indicate which variable is the independent one. The properties of the introduced statistics are demonstrated on deterministically and randomly related variables.
A physical law is represented by the probability distribution of a measured variable. The probability density is described by measured data using an estimator whose kernel is the instrument scattering function. The experimental information and data redundancy are defined in terms of information entropy. The model cost function, comprised of data redundancy and estimation error, is minimized by the creation-annihilation process.
Redundancy of experimental data is the basic statistic from which the complexity of a natural phenomenon and the proper number of experiments needed for its exploration can be estimated. The redundancy is expressed by the entropy of information pertaining to the probability density function of experimental variables. Since the calculation of entropy is inconvenient due to integration over a range of variables, an approximate expression for redundancy is derived that includes only a sum over the set of experimental data about these variables. The approximation makes feasible an efficient estimation of the redundancy of data along with the related experimental information and information cost function. From the experimental information the complexity of the phenomenon can be simply estimated, while the proper number of experiments needed for its exploration can be determined from the minimum of the cost function. The performance of the approximate estimation of these statistics is demonstrated on two-dimensional normally distributed random data.
Statistical modeling of experimental physical laws is based on the probability density function of measured variables. It is expressed by experimental data via a kernel estimator. The kernel is determined objectively by the scattering of data during
Elucidating physical mechanisms with statistical confidence from molecular dynamics simulations can be challenging owing to the many degrees of freedom that contribute to collective motions. To address this issue, we recently introduced a dynamical Galerkin approximation (DGA) [Thiede et al. J. Phys. Chem. 150, 244111 (2019)], in which chemical kinetic statistics that satisfy equations of dynamical operators are represented by a basis expansion. Here, we reformulate this approach, clarifying (and reducing) the dependence on the choice of lag time. We present a new projection of the reactive current onto collective variables and provide improved estimators for rates and committors. We also present simple procedures for constructing suitable smoothly varying basis functions from arbitrary molecular features. To evaluate estimators and basis sets numerically, we generate and carefully validate a dataset of short trajectories for the unfolding and folding of the trp-cage miniprotein, a well-studied system. Our analysis demonstrates a comprehensive strategy for characterizing reaction pathways quantitatively.
A subtractionless method for solving Fermi surface sheets ({tt FSS}), from measured $n$-axis-projected momentum distribution histograms by two-dimensional angular correlation of the positron-electron annihilation radiation ({tt 2D-ACAR}) technique, is discussed. The window least squares statistical noise smoothing filter described in Adam {sl et al.}, NIM A, {bf 337} (1993) 188, is first refined such that the window free radial parameters ({tt WRP}) are optimally adapted to the data. In an ideal single crystal, the specific jumps induced in the {tt WRP} distribution by the existing Fermi surface jumps yield straightforward information on the resolved {tt FSS}. In a real crystal, the smearing of the derived {tt WRP} optimal values, which originates from positron annihilations with electrons at crystal imperfections, is ruled out by median smoothing of the obtained distribution, over symmetry defined stars of bins. The analysis of a gigacount {tt 2D-ACAR} spectrum, measured on the archetypal high-$T_c$ compound $YBasb{2}Cusb{3}Osb{7-delta}$ at room temperature, illustrates the method. Both electronic {tt FSS}, the ridge along $Gamma X$ direction and the pillbox centered at the $S$ point of the first Brillouin zone, are resolved.