No Arabic abstract
Consider the case that one observes a single time-series, where at each time t one observes a data record O(t) involving treatment nodes A(t), possible covariates L(t) and an outcome node Y(t). The data record at time t carries information for an (potentially causal) effect of the treatment A(t) on the outcome Y(t), in the context defined by a fixed dimensional summary measure Co(t). We are concerned with defining causal effects that can be consistently estimated, with valid inference, for sequentially randomized experiments without further assumptions. More generally, we consider the case when the (possibly causal) effects can be estimated in a double robust manner, analogue to double robust estimation of effects in the i.i.d. causal inference literature. We propose a general class of averages of conditional (context-specific) causal parameters that can be estimated in a double robust manner, therefore fully utilizing the sequential randomization. We propose a targeted maximum likelihood estimator (TMLE) of these causal parameters, and present a general theorem establishing the asymptotic consistency and normality of the TMLE. We extend our general framework to a number of typically studied causal target parameters, including a sequentially adaptive design within a single unit that learns the optimal treatment rule for the unit over time. Our work opens up robust statistical inference for causal questions based on observing a single time-series on a particular unit.
The current work is motivated by the need for robust statistical methods for precision medicine; as such, we address the need for statistical methods that provide actionable inference for a single unit at any point in time. We aim to learn an optimal, unknown choice of the controlled components of the design in order to optimize the expected outcome; with that, we adapt the randomization mechanism for future time-point experiments based on the data collected on the individual over time. Our results demonstrate that one can learn the optimal rule based on a single sample, and thereby adjust the design at any point t with valid inference for the mean target parameter. This work provides several contributions to the field of statistical precision medicine. First, we define a general class of averages of conditional causal parameters defined by the current context for the single unit time-series data. We define a nonparametric model for the probability distribution of the time-series under few assumptions, and aim to fully utilize the sequential randomization in the estimation procedure via the double robust structure of the efficient influence curve of the proposed target parameter. We present multiple exploration-exploitation strategies for assigning treatment, and methods for estimating the optimal rule. Lastly, we present the study of the data-adaptive inference on the mean under the optimal treatment rule, where the target parameter adapts over time in response to the observed context of the individual. Our target parameter is pathwise differentiable with an efficient influence function that is doubly robust - which makes it easier to estimate than previously proposed variations. We characterize the limit distribution of our estimator under a Donsker condition expressed in terms of a notion of bracketing entropy adapted to martingale settings.
Several novel statistical methods have been developed to estimate large integrated volatility matrices based on high-frequency financial data. To investigate their asymptotic behaviors, they require a sub-Gaussian or finite high-order moment assumption for observed log-returns, which cannot account for the heavy tail phenomenon of stock returns. Recently, a robust estimator was developed to handle heavy-tailed distributions with some bounded fourth-moment assumption. However, we often observe that log-returns have heavier tail distribution than the finite fourth-moment and that the degrees of heaviness of tails are heterogeneous over the asset and time period. In this paper, to deal with the heterogeneous heavy-tailed distributions, we develop an adaptive robust integrated volatility estimator that employs pre-averaging and truncation schemes based on jump-diffusion processes. We call this an adaptive robust pre-averaging realized volatility (ARP) estimator. We show that the ARP estimator has a sub-Weibull tail concentration with only finite 2$alpha$-th moments for any $alpha>1$. In addition, we establish matching upper and lower bounds to show that the ARP estimation procedure is optimal. To estimate large integrated volatility matrices using the approximate factor model, the ARP estimator is further regularized using the principal orthogonal complement thresholding (POET) method. The numerical study is conducted to check the finite sample performance of the ARP estimator.
We address in this study the problem of learning a summary causal graph on time series with potentially different sampling rates. To do so, we first propose a new temporal mutual information measure defined on a window-based representation of time series. We then show how this measure relates to an entropy reduction principle that can be seen as a special case of the Probabilistic Raising Principle. We finally combine these two ingredients in a PC-like algorithm to construct the summary causal graph. This algorithm is evaluated on several datasets that shows both its efficacy and efficiency.
We consider a bivariate time series $(X_t,Y_t)$ that is given by a simple linear autoregressive model. Assuming that the equations describing each variable as a linear combination of past values are considered structural equations, there is a clear meaning of how intervening on one particular $X_t$ influences $Y_{t}$ at later times $t>t$. In the present work, we describe conditions under which one can define a causal model between variables that are coarse-grained in time, thus admitting statements like `setting $X$ to $x$ changes $Y$ in a certain way without referring to specific time instances. We show that particularly simple statements follow in the frequency domain, thus providing meaning to interventions on frequencies.
The spectral gap $gamma$ of a finite, ergodic, and reversible Markov chain is an important parameter measuring the asymptotic rate of convergence. In applications, the transition matrix $P$ may be unknown, yet one sample of the chain up to a fixed time $n$ may be observed. We consider here the problem of estimating $gamma$ from this data. Let $pi$ be the stationary distribution of $P$, and $pi_star = min_x pi(x)$. We show that if $n = tilde{O}bigl(frac{1}{gamma pi_star}bigr)$, then $gamma$ can be estimated to within multiplicative constants with high probability. When $pi$ is uniform on $d$ states, this matches (up to logarithmic correction) a lower bound of $tilde{Omega}bigl(frac{d}{gamma}bigr)$ steps required for precise estimation of $gamma$. Moreover, we provide the first procedure for computing a fully data-dependent interval, from a single finite-length trajectory of the chain, that traps the mixing time $t_{text{mix}}$ of the chain at a prescribed confidence level. The interval does not require the knowledge of any parameters of the chain. This stands in contrast to previous approaches, which either only provide point estimates, or require a reset mechanism, or additional prior knowledge. The interval is constructed around the relaxation time $t_{text{relax}} = 1/gamma$, which is strongly related to the mixing time, and the width of the interval converges to zero roughly at a $1/sqrt{n}$ rate, where $n$ is the length of the sample path.