Advanced search powered by artificial intelligence

New community

Subscribe to the gold package and get unlimited access to Shamra Academy

Optimal prediction of Markov chains with and without spectral gap

105 0 0.0 ( 0 )

Download Cite

Added by Soham Jana

Publication date 2021

fields Mathematical Statistics

and research's language is English

Authors Yanjun Han - Soham Jana - Yihong Wu

Statistics Theory Statistics Theory

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

We study the following learning problem with dependent data: Observing a trajectory of length $n$ from a stationary Markov chain with $k$ states, the goal is to predict the next state. For $3 leq k leq O(sqrt{n})$, using techniques from universal compression, the optimal prediction risk in Kullback-Leibler divergence is shown to be $Theta(frac{k^2}{n}log frac{n}{k^2})$, in contrast to the optimal rate of $Theta(frac{log log n}{n})$ for $k=2$ previously shown in Falahatgar et al., 2016. These rates, slower than the parametric rate of $O(frac{k^2}{n})$, can be attributed to the memory in the data, as the spectral gap of the Markov chain can be arbitrarily small. To quantify the memory effect, we study irreducible reversible chains with a prescribed spectral gap. In addition to characterizing the optimal prediction risk for two states, we show that, as long as the spectral gap is not excessively small, the prediction risk in the Markov model is $O(frac{k^2}{n})$, which coincides with that of an iid model with the same number of parameters.

rate research

Markov chain Monte Carlo test of toric homogeneous Markov chains

636 - Akimichi Takemura , Hisayuki Hara 2010

Markov chain models are used in various fields, such behavioral sciences or econometrics. Although the goodness of fit of the model is usually assessed by large sample approximation, it is desirable to use conditional tests if the sample size is not large. We study Markov bases for performing conditional tests of the toric homogeneous Markov chain model, which is the envelope exponential family for the usual homogeneous Markov chain model. We give a complete description of a Markov basis for the following cases: i) two-state, arbitrary length, ii) arbitrary finite state space and length of three. The general case remains to be a conjecture. We also present a numerical example of conditional tests based on our Markov basis.

Statistics Theory Statistics Theory

Bernsteins inequality for general Markov chains

72 - Bai Jiang , Qiang Sun , Jianqing Fan 2018

We establish Bernstein inequalities for functions of general (general-state-space, not necessarily reversible) Markov chains. These inequalities achieve sharp variance proxies and recover the classical Bernsteins inequality under independence. The key analysis lies in upper bounding the operator norm of a perturbed Markov transition kernel by the limiting operator norm of a sequence of finite-rank and perturbed Markov transition kernels. For each finite-rank and perturbed Markov kernel, we bound its norm by the sum of two convex functions. One coincides with what delivers the classical Bernsteins inequality, and the other reflects the influence of the Markov dependence. A convex analysis on conjugates of these two functions then derives our Bernstein inequalities.

Statistics Theory Statistics Theory

Hoeffdings lemma for Markov Chains and its applications to statistical learning

118 - Jianqing Fan , Bai Jiang , Qiang Sun 2018

We extend Hoeffdings lemma to general-state-space and not necessarily reversible Markov chains. Let ${X_i}_{i ge 1}$ be a stationary Markov chain with invariant measure $pi$ and absolute spectral gap $1-lambda$, where $lambda$ is defined as the operator norm of the transition kernel acting on mean zero and square-integrable functions with respect to $pi$. Then, for any bounded functions $f_i: x mapsto [a_i,b_i]$, the sum of $f_i(X_i)$ is sub-Gaussian with variance proxy $frac{1+lambda}{1-lambda} cdot sum_i frac{(b_i-a_i)^2}{4}$. This result differs from the classical Hoeffdings lemma by a multiplicative coefficient of $(1+lambda)/(1-lambda)$, and simplifies to the latter when $lambda = 0$. The counterpart of Hoeffdings inequality for Markov chains immediately follows. Our results assume none of countable state space, reversibility and time-homogeneity of Markov chains and cover time-dependent functions with various ranges. We illustrate the utility of these results by applying them to six problems in statistics and machine learning.

Statistics Theory Statistics Theory

A Markov basis for two-state toric homogeneous Markov chain model without initial parameters

503 - Hisayuki Hara , Akimichi Takemura 2010

We derive a Markov basis consisting of moves of degree at most three for two-state toric homogeneous Markov chain model of arbitrary length without parameters for initial states. Our basis consists of moves of degree three and degree one, which alter the initial frequencies, in addition to moves of degree two and degree one for toric homogeneous Markov chain model with parameters for initial states.

Statistics Theory Statistics Theory

Batch means and spectral variance estimators in Markov chain Monte Carlo

549 - James M. Flegal , Galin L. Jones 2010

Calculating a Monte Carlo standard error (MCSE) is an important step in the statistical analysis of the simulation output obtained from a Markov chain Monte Carlo experiment. An MCSE is usually based on an estimate of the variance of the asymptotic normal distribution. We consider spectral and batch means methods for estimating this variance. In particular, we establish conditions which guarantee that these estimators are strongly consistent as the simulation effort increases. In addition, for the batch means and overlapping batch means methods we establish conditions ensuring consistency in the mean-square sense which in turn allows us to calculate the optimal batch size up to a constant of proportionality. Finally, we examine the empirical finite-sample properties of spectral variance and batch means estimators and provide recommendations for practitioners.

Statistics Theory Statistics Theory

comments

Fetching comments

University of Assiut

Additional details More universities

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

Optimal prediction of Markov chains with and without spectral gap

Ask ChatGPT about the research

No Arabic abstract

Read More