No Arabic abstract
In this work a method for statistical analysis of time series is proposed, which is used to obtain solutions to some classical problems of mathematical statistics under the only assumption that the process generating the data is stationary ergodic. Namely, three problems are considered: goodness-of-fit (or identity) testing, process classification, and the change point problem. For each of the problems a test is constructed that is asymptotically accurate for the case when the data is generated by stationary ergodic processes. The tests are based on empirical estimates of distributional distance.
The setting is a stationary, ergodic time series. The challenge is to construct a sequence of functions, each based on only finite segments of the past, which together provide a strongly consistent estimator for the conditional probability of the next observation, given the infinite past. Ornstein gave such a construction for the case that the values are from a finite set, and recently Algoet extended the scheme to time series with coordinates in a Polish space. The present study relates a different solution to the challenge. The algorithm is simple and its verification is fairly transparent. Some extensions to regression, pattern recognition, and on-line forecasting are mentioned.
In recent years, infinite-dimensional methods have been introduced for the Gaussian channels estimation. The aim of this paper is to study the application of similar methods to Poisson channels. In particular we compute the Bayesian estimator of a Poisson channel using the likelihood ratio and the discrete Malliavin gradient. This algorithm is suitable for numerical implementation via the Monte-Carlo scheme. As an application we provide an new proof of the formula obtained recently by Guo, Shamai and Verduu relating some derivatives of the input-output mutual information of a time-continuous Poisson channel and the conditional mean estimator of the input. These results are then extended to mixed Gaussian-Poisson channels.
Nowadays data compressors are applied to many problems of text analysis, but many such applications are developed outside of the framework of mathematical statistics. In this paper we overcome this obstacle and show how several methods of classical mathematical statistics can be developed based on applications of the data compressors.
We study minimization of a parametric family of relative entropies, termed relative $alpha$-entropies (denoted $mathscr{I}_{alpha}(P,Q)$). These arise as redundancies under mismatched compression when cumulants of compressed lengths are considered instead of expected compressed lengths. These parametric relative entropies are a generalization of the usual relative entropy (Kullback-Leibler divergence). Just like relative entropy, these relative $alpha$-entropies behave like squared Euclidean distance and satisfy the Pythagorean property. Minimization of $mathscr{I}_{alpha}(P,Q)$ over the first argument on a set of probability distributions that constitutes a linear family is studied. Such a minimization generalizes the maximum R{e}nyi or Tsallis entropy principle. The minimizing probability distribution (termed $mathscr{I}_{alpha}$-projection) for a linear family is shown to have a power-law.
We introduce a Bayesian approach to discovering patterns in structurally complex processes. The proposed method of Bayesian Structural Inference (BSI) relies on a set of candidate unifilar HMM (uHMM) topologies for inference of process structure from a data series. We employ a recently developed exact enumeration of topological epsilon-machines. (A sequel then removes the topological restriction.) This subset of the uHMM topologies has the added benefit that inferred models are guaranteed to be epsilon-machines, irrespective of estimated transition probabilities. Properties of epsilon-machines and uHMMs allow for the derivation of analytic expressions for estimating transition probabilities, inferring start states, and comparing the posterior probability of candidate model topologies, despite process internal structure being only indirectly present in data. We demonstrate BSIs effectiveness in estimating a processs randomness, as reflected by the Shannon entropy rate, and its structure, as quantified by the statistical complexity. We also compare using the posterior distribution over candidate models and the single, maximum a posteriori model for point estimation and show that the former more accurately reflects uncertainty in estimated values. We apply BSI to in-class examples of finite- and infinite-order Markov processes, as well to an out-of-class, infinite-state hidden process.