أوراق بحثية, رسائل ماجستير ودكتوراه منشورة من قبل Gusztav Morvai

264 - Gusztav Morvai , Benjamin Weiss 2008

For a stationary stochastic process ${X_n}$ with values in some set $A$, a finite word $w in A^K$ is called a memory word if the conditional probability of $X_0$ given the past is constant on the cylinder set defined by $X_{-K}^{-1}=w$. It is a calle d a minimal memory word if no proper suffix of $w$ is also a memory word. For example in a $K$-step Markov processes all words of length $K$ are memory words but not necessarily minimal. We consider the problem of determining the lengths of the longest minimal memory words and the shortest memory words of an unknown process ${X_n}$ based on sequentially observing the outputs of a single sample ${xi_1,xi_2,...xi_n}$. We will give a universal estimator which converges almost surely to the length of the longest minimal memory word and show that no such universal estimator exists for the length of the shortest memory word. The alphabet $A$ may be finite or countable.

نظرية المعلومات نظرية المعلومات

A simple randomized algorithm for sequential prediction of ergodic time series

114 - L. Gyorfi , G. Lugosi , G. Morvai 2008

We present a simple randomized procedure for the prediction of a binary sequence. The algorithm uses ideas from recent developments of the theory of the prediction of individual sequences. We show that if the sequence is a realization of a stationary and ergodic random process then the average number of mistakes converges, almost surely, to that of the optimum, given by the Bayes predictor. The desirable finite-sample properties of the predictor are illustrated by its performance for Markov processes. In such cases the predictor exhibits near optimal behavior even without knowing the order of the Markov process. Prediction with side information is also considered.

نظرية الإحصاء نظرية المعلومات نظرية المعلومات

Weakly Convergent Nonparametric Forecasting of Stationary Time Series

243 - G. Morvai , S. Yakowitz , P. Algoet 2008

The conditional distribution of the next outcome given the infinite past of a stationary process can be inferred from finite but growing segments of the past. Several schemes are known for constructing pointwise consistent estimates, but they all dem and prohibitive amounts of input data. In this paper we consider real-valued time series and construct conditional distribution estimates that make much more efficient use of the input data. The estimates are consistent in a weak sense, and the question whether they are pointwise consistent is still open. For finite-alphabet processes one may rely on a universal data compression scheme like the Lempel-Ziv algorithm to construct conditional probability mass function estimates that are consistent in expected information divergence. Consistency in this strong sense cannot be attained in a universal sense for all stationary processes with values in an infinite alphabet, but weak consistency can. Some applications of the estimates to on-line forecasting, regression and classification are discussed.

نظرية الإحصاء نظرية المعلومات نظرية المعلومات

On Sequential Estimation and Prediction for Discrete Time Series

97 - G. Morvai , B. Weiss 2008

The problem of extracting as much information as possible from a sequence of observations of a stationary stochastic process $X_0,X_1,...X_n$ has been considered by many authors from different points of view. It has long been known through the work o f D. Bailey that no universal estimator for $textbf{P}(X_{n+1}|X_0,X_1,...X_n)$ can be found which converges to the true estimator almost surely. Despite this result, for restricted classes of processes, or for sequences of estimators along stopping times, universal estimators can be found. We present here a survey of some of the recent work that has been done along these lines.