The spectral gap $\gamma$ of a finite, ergodic, and reversible Markov chain is an important parameter measuring the asymptotic rate of convergence. In applications, the transition matrix $P$ may be unknown, yet one sample of the chain up to a fixed time $n$ may be observed. We consider here the problem of estimating $\gamma$ from this data. Let $\pi$ be the stationary distribution of $P$, and $\pi_\star = \min_x \pi(x)$. We show that if $n = \tilde{O}\bigl(\frac{1}{\gamma \pi_\star}\bigr)$, then $\gamma$ can be estimated to within multiplicative constants with high probability. When $\pi$ is uniform on $d$ states, this matches (up to logarithmic corrections) a lower bound of $\tilde{\Omega}\bigl(\frac{d}{\gamma}\bigr)$ steps required for precise estimation of $\gamma$. Moreover, we provide the first procedure for computing a fully data-dependent interval, from a single finite-length trajectory of the chain, that traps the mixing time $t_{\text{mix}}$ of the chain at a prescribed confidence level. The interval does not require knowledge of any parameters of the chain. This stands in contrast to previous approaches, which either provide only point estimates, or require a reset mechanism, or additional prior knowledge. The interval is constructed around the relaxation time $t_{\text{relax}} = 1/\gamma$, which is strongly related to the mixing time, and the width of the interval converges to zero roughly at a $1/\sqrt{n}$ rate, where $n$ is the length of the sample path.
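Not the estimator or interval construction analyzed above, but a minimal sketch of the plug-in idea that such results make rigorous, assuming the trajectory visits states labeled $0, \dots, d-1$ (the function name and the symmetrization details are illustrative choices): form the empirical transition matrix from the single trajectory, reversibilize it with the empirical stationary distribution, and read the spectral gap off its eigenvalues.

```python
import numpy as np

def empirical_spectral_gap(trajectory, d):
    """Plug-in estimate of the absolute spectral gap from one trajectory.

    `trajectory` is a sequence of states in {0, ..., d-1}; the name and the
    symmetrization step are illustrative assumptions, not the paper's method.
    """
    trajectory = np.asarray(trajectory, dtype=int)

    # Empirical transition counts and row-normalized transition matrix.
    counts = np.zeros((d, d))
    for x, y in zip(trajectory[:-1], trajectory[1:]):
        counts[x, y] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    P_hat = counts / np.maximum(row_sums, 1)

    # Empirical stationary distribution from state visit frequencies.
    pi_hat = np.bincount(trajectory, minlength=d).astype(float)
    pi_hat /= pi_hat.sum()

    # Reversibilization: D^{1/2} P D^{-1/2} is symmetric for a reversible
    # chain, so its eigenvalues are real.
    D_sqrt = np.diag(np.sqrt(pi_hat))
    D_inv_sqrt = np.diag(1.0 / np.sqrt(np.maximum(pi_hat, 1e-12)))
    L = D_sqrt @ P_hat @ D_inv_sqrt
    L = (L + L.T) / 2  # enforce symmetry against sampling noise

    eigvals = np.sort(np.linalg.eigvalsh(L))
    # Absolute gap: 1 minus the largest nontrivial eigenvalue modulus.
    return 1.0 - max(eigvals[-2], -eigvals[0])
```

For a reversible chain, the symmetrized matrix has real eigenvalues, and the largest nontrivial one determines the relaxation time $t_{\text{relax}} = 1/\gamma$ around which the data-dependent interval above is built.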
We introduce a new nonparametric density estimator inspired by Markov chains, which generalizes the well-known kernel density estimator (KDE). Our estimator offers several advantages over the usual ones and can be used directly as a foundation in any density-based algorithm. We prove the consistency of our estimator, and we find that it typically outperforms KDE in settings with large sample size and high dimensionality. We also employ our density estimator to build a local outlier detector, which shows very promising results when applied to some realistic datasets.
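For reference, a minimal sketch of the classical Gaussian KDE that the proposed estimator generalizes (the isotropic kernel and the Scott-style bandwidth default are illustrative choices, not those of the paper):

```python
import numpy as np

def gaussian_kde(samples, query_points, bandwidth=None):
    """Classical Gaussian kernel density estimate: the baseline that the
    Markov-chain-inspired estimator generalizes. The bandwidth defaults to a
    Scott-style rule, an illustrative choice."""
    samples = np.asarray(samples, dtype=float)             # shape (n, d)
    query_points = np.asarray(query_points, dtype=float)   # shape (m, d)
    n, d = samples.shape
    if bandwidth is None:
        bandwidth = n ** (-1.0 / (d + 4)) * samples.std(axis=0).mean()

    # Pairwise squared distances between query points and samples.
    diffs = query_points[:, None, :] - samples[None, :, :]
    sq_dist = (diffs ** 2).sum(axis=-1)

    # Average of isotropic Gaussian kernels centered at each sample point.
    norm = (2 * np.pi * bandwidth ** 2) ** (d / 2)
    return np.exp(-sq_dist / (2 * bandwidth ** 2)).sum(axis=1) / (n * norm)
```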
The spectral gap $\gamma$ of an ergodic and reversible Markov chain is an important parameter measuring the asymptotic rate of convergence. In applications, the transition matrix $P$ may be unknown, yet one sample of the chain up to a fixed time $t$ may be observed. Hsu, Kontorovich, and Szepesvari (2015) considered the problem of estimating $\gamma$ from this data. Let $\pi$ be the stationary distribution of $P$, and $\pi_\star = \min_x \pi(x)$. They showed that, if $t = \tilde{O}\bigl(\frac{1}{\gamma^3 \pi_\star}\bigr)$, then $\gamma$ can be estimated to within multiplicative constants with high probability. They also proved that $\tilde{\Omega}\bigl(\frac{n}{\gamma}\bigr)$ steps are required for precise estimation of $\gamma$, where $n$ is the number of states. We show that $\tilde{O}\bigl(\frac{1}{\gamma \pi_\star}\bigr)$ steps of the chain suffice to estimate $\gamma$ up to multiplicative constants with high probability. When $\pi$ is uniform, this matches (up to logarithmic corrections) the lower bound of Hsu, Kontorovich, and Szepesvari.
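To spell out the matching claim: if $\pi$ is uniform on $n$ states, then $\pi_\star = 1/n$, so the upper bound becomes $\tilde{O}\bigl(\frac{1}{\gamma \pi_\star}\bigr) = \tilde{O}\bigl(\frac{n}{\gamma}\bigr)$, which coincides with the $\tilde{\Omega}\bigl(\frac{n}{\gamma}\bigr)$ lower bound up to logarithmic factors.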
In this paper, we have established a unified framework for multistage parameter estimation. We demonstrate that a wide variety of statistical problems, such as fixed-sample-size interval estimation, point estimation with error control, bounded-width confidence intervals, interval estimation following hypothesis testing, and construction of confidence sequences, can be cast into the general framework of constructing sequential random intervals with prescribed coverage probabilities. We have developed exact methods for the construction of such sequential random intervals in the context of multistage sampling. In particular, we have established an inclusion principle and coverage-tuning techniques to control and adjust the coverage probabilities of sequential random intervals. We have obtained concrete sampling schemes which are unprecedentedly efficient, in terms of sampling effort, compared with existing procedures.
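As a toy illustration only, and not one of the paper's sampling schemes: the sketch below draws data in stages, recomputes a Hoeffding-style confidence interval after each stage (with a crude union bound over stages so that coverage holds at the random stopping time), and stops once the interval is narrower than a prescribed half-width, yielding a simple bounded-width sequential random interval. All names and constants are illustrative assumptions.

```python
import numpy as np

def staged_bounded_width_ci(sample_stage, half_width, confidence=0.95,
                            stage_size=100, max_stages=200):
    """Toy multistage procedure (illustrative, not the paper's scheme):
    keep drawing stages of [0, 1]-valued data and stop once a Hoeffding-style
    confidence interval for the mean is narrower than `half_width`."""
    delta = 1.0 - confidence
    data = np.empty(0)
    for stage in range(1, max_stages + 1):
        data = np.concatenate([data, sample_stage(stage_size)])
        n = data.size
        # Hoeffding half-width with a union bound over the possible stages,
        # so the coverage guarantee holds at the random stopping time.
        eps = np.sqrt(np.log(2 * max_stages / delta) / (2 * n))
        if eps <= half_width:
            break
    mean = data.mean()
    return mean - eps, mean + eps

# Example: estimate the mean of a Bernoulli(0.3) to within +/- 0.05.
rng = np.random.default_rng(0)
lo, hi = staged_bounded_width_ci(lambda m: rng.binomial(1, 0.3, size=m), 0.05)
```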
Consider the case in which one observes a single time series, where at each time t one observes a data record O(t) involving treatment nodes A(t), possible covariates L(t), and an outcome node Y(t). The data record at time t carries information about a (potentially causal) effect of the treatment A(t) on the outcome Y(t), in the context defined by a fixed-dimensional summary measure Co(t). We are concerned with defining causal effects that can be consistently estimated, with valid inference, for sequentially randomized experiments without further assumptions. More generally, we consider the case when the (possibly causal) effects can be estimated in a double robust manner, analogous to double robust estimation of effects in the i.i.d. causal inference literature. We propose a general class of averages of conditional (context-specific) causal parameters that can be estimated in a double robust manner, thereby fully utilizing the sequential randomization. We propose a targeted maximum likelihood estimator (TMLE) of these causal parameters, and present a general theorem establishing the asymptotic consistency and normality of the TMLE. We extend our general framework to a number of typically studied causal target parameters, including a sequentially adaptive design within a single unit that learns the optimal treatment rule for the unit over time. Our work opens up robust statistical inference for causal questions based on observing a single time series on a particular unit.
Many modern techniques employed in physics, such as the computation of path integrals, rely on random walks on graphs that can be represented as Markov chains. Traditionally, estimates of the running times of such sampling algorithms are computed using the number of steps in the chain needed to reach the stationary distribution. This quantity is generally known as the mixing time and is often difficult to compute. In this paper, we suggest an alternative estimate based on the Kolmogorov-Sinai entropy (KSE), by establishing a link between the maximization of the KSE and the minimization of the mixing time. Since the KSE is in general easier to compute than the mixing time, this link provides a new, faster method to approximate the minimum mixing time that could be of interest in computer science and statistical physics. Beyond this, our finding will also be of interest to the out-of-equilibrium community, by providing a new rationale for selecting stationary states in out-of-equilibrium physics: it seems reasonable that in a physical system with two simultaneous, equiprobable possible dynamics, the final stationary state will be closer to the stationary state corresponding to the faster dynamics (smaller mixing time). Through the empirical link found in this letter, this state will correspond to a state of maximal Kolmogorov-Sinai entropy. If this is true, this would provide a more satisfying rule for selecting stationary states in complex systems, such as the climate, than the maximization of the entropy production.
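A minimal sketch of the two quantities being related, for a finite ergodic chain with known transition matrix $P$ (an illustrative computation, not the letter's analysis): the Kolmogorov-Sinai entropy rate $h = -\sum_x \pi(x) \sum_y P(x,y) \log P(x,y)$ and the relaxation time $1/\gamma$, a standard proxy for the mixing time, where $\gamma$ is the spectral gap.

```python
import numpy as np

def ks_entropy_and_relaxation(P):
    """For a finite ergodic Markov chain with transition matrix P, return the
    Kolmogorov-Sinai entropy rate and the relaxation time 1/gap, a standard
    proxy for the mixing time (an illustrative pairing of the two quantities,
    not the letter's exact computation)."""
    P = np.asarray(P, dtype=float)

    # Stationary distribution: left eigenvector of P for eigenvalue 1.
    vals, vecs = np.linalg.eig(P.T)
    pi = np.real(vecs[:, np.argmin(np.abs(vals - 1))])
    pi = np.abs(pi) / np.abs(pi).sum()

    # Kolmogorov-Sinai entropy rate: h = -sum_x pi(x) sum_y P(x,y) log P(x,y).
    with np.errstate(divide="ignore", invalid="ignore"):
        logP = np.where(P > 0, np.log(P), 0.0)
    h = -np.sum(pi[:, None] * P * logP)

    # Spectral gap and relaxation time (assumes reversibility, so that the
    # eigenvalues of P are real).
    eigvals = np.sort(np.real(np.linalg.eigvals(P)))
    gap = 1.0 - max(eigvals[-2], -eigvals[0])
    return h, 1.0 / gap

# Example: a lazy random walk on three states.
P = np.array([[0.5, 0.25, 0.25],
              [0.25, 0.5, 0.25],
              [0.25, 0.25, 0.5]])
print(ks_entropy_and_relaxation(P))
```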