For a stationary stochastic process $\{X_n\}$ with values in some set $A$, a finite word $w \in A^K$ is called a memory word if the conditional probability of $X_0$ given the past is constant on the cylinder set defined by $X_{-K}^{-1}=w$. It is called a minimal memory word if no proper suffix of $w$ is also a memory word. For example, in a $K$-step Markov process all words of length $K$ are memory words, but not necessarily minimal ones. We consider the problem of determining the lengths of the longest minimal memory words and the shortest memory words of an unknown process $\{X_n\}$ based on sequentially observing the outputs of a single sample $\{\xi_1,\xi_2,\dots,\xi_n\}$. We will give a universal estimator which converges almost surely to the length of the longest minimal memory word and show that no such universal estimator exists for the length of the shortest memory word. The alphabet $A$ may be finite or countable.
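The definition of a memory word can be made concrete with a small empirical check. The following is a minimal sketch, not the estimator of the abstract: it only compares the empirical next-symbol distribution after $w$ with the distribution after each one-symbol-longer context $aw$, which is a heuristic stand-in for "constant on the cylinder set". The function names `next_symbol_dist` and `looks_like_memory_word` and the tolerance `tol` are hypothetical choices made for illustration.

```python
from collections import Counter


def next_symbol_dist(seq, word):
    """Empirical distribution of the symbol following each occurrence of `word`."""
    word = tuple(word)
    k = len(word)
    counts = Counter(
        seq[i + k] for i in range(len(seq) - k) if tuple(seq[i:i + k]) == word
    )
    total = sum(counts.values())
    return {s: c / total for s, c in counts.items()} if total else {}


def looks_like_memory_word(seq, word, tol=0.05):
    """Heuristic check (an assumption, not the paper's method): prepending any
    observed symbol to `word` should not change the next-symbol distribution."""
    base = next_symbol_dist(seq, word)
    if not base:
        return False  # the word never occurs, so nothing can be said
    for a in set(seq):
        ext = next_symbol_dist(seq, (a,) + tuple(word))
        if not ext:
            continue  # this longer context was never observed
        symbols = set(base) | set(ext)
        if any(abs(base.get(s, 0.0) - ext.get(s, 0.0)) > tol for s in symbols):
            return False
    return True


# Toy sample: the periodic pattern 0, 0, 1 repeated.
sample = [0, 0, 1] * 200
print(looks_like_memory_word(sample, (0,)))    # a single 0 does not fix the next symbol
print(looks_like_memory_word(sample, (1,)))    # after a 1 the next symbol is always 0
print(looks_like_memory_word(sample, (0, 0)))  # after 0, 0 the next symbol is always 1
```

In this toy sample the word `(0, 0)` passes the check while its proper suffix `(0,)` does not, so `(0, 0)` behaves like a minimal memory word of length 2, whereas `(1,)` behaves like a memory word of length 1.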
114 - L. Gyorfi, G. Lugosi, G. Morvai 2008
We present a simple randomized procedure for the prediction of a binary sequence. The algorithm uses ideas from recent developments of the theory of the prediction of individual sequences. We show that if the sequence is a realization of a stationary and ergodic random process then the average number of mistakes converges, almost surely, to that of the optimum, given by the Bayes predictor. The desirable finite-sample properties of the predictor are illustrated by its performance for Markov processes. In such cases the predictor exhibits near optimal behavior even without knowing the order of the Markov process. Prediction with side information is also considered.
The conditional distribution of the next outcome given the infinite past of a stationary process can be inferred from finite but growing segments of the past. Several schemes are known for constructing pointwise consistent estimates, but they all demand prohibitive amounts of input data. In this paper we consider real-valued time series and construct conditional distribution estimates that make much more efficient use of the input data. The estimates are consistent in a weak sense, and the question whether they are pointwise consistent is still open. For finite-alphabet processes one may rely on a universal data compression scheme like the Lempel-Ziv algorithm to construct conditional probability mass function estimates that are consistent in expected information divergence. Consistency in this strong sense cannot be attained in a universal sense for all stationary processes with values in an infinite alphabet, but weak consistency can. Some applications of the estimates to on-line forecasting, regression and classification are discussed.
97 - G. Morvai, B. Weiss 2008
The problem of extracting as much information as possible from a sequence of observations of a stationary stochastic process $X_0,X_1,\dots,X_n$ has been considered by many authors from different points of view. It has long been known through the work of D. Bailey that no universal estimator for $\textbf{P}(X_{n+1}|X_0,X_1,\dots,X_n)$ can be found which converges to the true estimator almost surely. Despite this result, for restricted classes of processes, or for sequences of estimators along stopping times, universal estimators can be found. We present here a survey of some of the recent work that has been done along these lines.
Let $\{(X_i,Y_i)\}$ be a stationary ergodic time series with $(X,Y)$ values in the product space $R^d \bigotimes R$. This study offers what is believed to be the first strongly consistent (with respect to pointwise, least-squares, and uniform distance) algorithm for inferring $m(x)=E[Y_0|X_0=x]$ under the presumption that $m(x)$ is uniformly Lipschitz continuous. Auto-regression, or forecasting, is an important special case, and as such our work extends the literature of nonparametric, nonlinear forecasting by circumventing customary mixing assumptions. The work is motivated by a time series model in stochastic finance and by perspectives of its contribution to the issues of universal time series estimation.
94 - L. Gyorfi, G. Morvai, 2007
This study concerns problems of time-series forecasting under the weakest of assumptions. Related results are surveyed and are points of departure for the developments here, some of which are new and others are new derivations of previous findings. The contributions in this study are all negative, showing that various plausible prediction problems are unsolvable, or in other cases, are not solvable by predictors which are known to be consistent when mixing conditions hold.
Finitarily Markovian processes are those processes $\{X_n\}_{n=-\infty}^{\infty}$ for which there is a finite $K$ ($K = K(\{X_n\}_{n=-\infty}^0)$) such that the conditional distribution of $X_1$ given the entire past is equal to the conditional distribution of $X_1$ given only $\{X_n\}_{n=1-K}^0$. The least such value of $K$ is called the memory length. We give a rather complete analysis of the problems of universally estimating the least such value of $K$, both in the backward sense that we have just described and in the forward sense, where one observes successive values of $\{X_n\}$ for $n \geq 0$ and asks for the least value $K$ such that the conditional distribution of $X_{n+1}$ given $\{X_i\}_{i=n-K+1}^n$ is the same as the conditional distribution of $X_{n+1}$ given $\{X_i\}_{i=-\infty}^n$. We allow for finite or countably infinite alphabet size.
The forward estimation problem for stationary and ergodic time series $\{X_n\}_{n=0}^{\infty}$ taking values from a finite alphabet ${\cal X}$ is to estimate the probability that $X_{n+1}=x$ based on the observations $X_i$, $0\le i\le n$, without prior knowledge of the distribution of the process $\{X_n\}$. We present a simple procedure $g_n$ which is evaluated on the data segment $(X_0,\dots,X_n)$ and for which ${\rm error}(n) = |g_{n}(x)-P(X_{n+1}=x |X_0,\dots,X_n)|\to 0$ almost surely for a subclass of all stationary and ergodic time series, while for the full class the Cesaro average of the error tends to zero almost surely and, moreover, the error tends to zero in probability.
105 - G. Morvai, B. Weiss 2007
We describe estimators $\chi_n(X_0,X_1,\dots,X_n)$, which when applied to an unknown stationary process taking values from a countable alphabet ${\cal X}$, converge almost surely to $k$ in case the process is a $k$-th order Markov chain and to infinity otherwise.
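To give a feel for what an order estimator computes, here is a minimal sketch of a different, standard technique: choosing the Markov order by a penalized maximum likelihood (BIC-style) criterion. It is not the estimator $\chi_n$ of the abstract; it assumes a finite alphabet and a user-supplied cap `max_order`, so it can never "converge to infinity". The names `log_likelihood` and `bic_order` are hypothetical.

```python
import math
from collections import Counter


def log_likelihood(seq, k):
    """Maximized log-likelihood of `seq` under a k-th order Markov model,
    using empirical transition frequencies as the parameter estimates."""
    ctx_counts = Counter()
    trans_counts = Counter()
    for i in range(k, len(seq)):
        ctx = tuple(seq[i - k:i])
        ctx_counts[ctx] += 1
        trans_counts[(ctx, seq[i])] += 1
    return sum(
        c * math.log(c / ctx_counts[ctx]) for (ctx, _), c in trans_counts.items()
    )


def bic_order(seq, max_order):
    """Return the order in 0..max_order that maximizes
    log-likelihood - 0.5 * (number of free parameters) * log(n)."""
    alphabet_size = len(set(seq))  # assumption: every symbol appears in the sample
    n = len(seq)
    best_k, best_score = 0, float("-inf")
    for k in range(max_order + 1):
        n_params = (alphabet_size ** k) * (alphabet_size - 1)
        score = log_likelihood(seq, k) - 0.5 * n_params * math.log(n)
        if score > best_score:
            best_k, best_score = k, score
    return best_k


# Toy samples: periodic patterns behave like deterministic Markov chains.
print(bic_order([0, 1] * 200, max_order=3))     # next symbol is fixed by 1 past symbol
print(bic_order([0, 0, 1] * 200, max_order=3))  # next symbol is fixed by 2 past symbols
```

The penalty term grows with the number of contexts, which is what stops the criterion from always choosing the largest allowed order; with a countable alphabet or no cap on the order, this simple recipe no longer applies as written.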
138 - G. Morvai, B. Weiss 2007
Let $\{X_n\}$ be a stationary and ergodic time series taking values from a finite or countably infinite set ${\cal X}$. Assume that the distribution of the process is otherwise unknown. We propose a sequence of stopping times $\lambda_n$ along which we will be able to estimate the conditional probability $P(X_{\lambda_n+1}=x|X_0,\dots,X_{\lambda_n})$ from data segment $(X_0,\dots,X_{\lambda_n})$ in a pointwise consistent way for a restricted class of stationary and ergodic finite or countably infinite alphabet time series which includes among others all stationary and ergodic finitarily Markovian processes. If the stationary and ergodic process turns out to be finitarily Markovian (among others, all stationary and ergodic Markov chains are included in this class) then $\lim_{n\to \infty} {n\over \lambda_n}>0$ almost surely. If the stationary and ergodic process turns out to possess finite entropy rate then $\lambda_n$ is upper bounded by a polynomial, eventually almost surely.