While there is considerable work on change point analysis for univariate time series, more and more of the data being collected come from high-dimensional multivariate settings. This paper introduces the asymptotic concept of high-dimensional efficiency, which quantifies the detection power of different statistics in such situations. While related to classic asymptotic relative efficiency, it differs in that it provides the rate at which the change can shrink with the dimension while remaining detectable. This also allows for comparisons between methods with different null asymptotics, as is the case, for example, in high-dimensional change point settings. Based on this new concept we investigate change point detection procedures using projections and develop asymptotic theory for how full panel (multivariate) tests compare with both oracle and random projections. Furthermore, for each given projection we quantify a cone such that the corresponding projection statistic yields better power behavior whenever the true change direction lies within this cone. We also investigate the effect of misspecifying the covariance on the power of the tests, because in many high-dimensional situations estimating the full dependency (covariance) between the multivariate observations in the panel is computationally or even theoretically infeasible. It turns out that the projection statistic is much more robust in this respect in terms of size and somewhat more robust in terms of power. The theoretical quantification is accompanied by simulation results which confirm the asymptotic findings for surprisingly small samples. This shows in particular that the concept of high-dimensional efficiency is indeed suitable for describing small-sample power, which we also demonstrate in a multivariate example of market index data.
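As a minimal illustration of the projection idea, the following sketch (not the paper's implementation; the function name, the single-change Gaussian setting, and the normalization are assumptions for illustration) projects the panel onto a given direction and scans a standardized univariate CUSUM statistic:

```python
import numpy as np

def projected_cusum(X, direction):
    """X: (n, d) panel; direction: (d,) projection vector.
    Returns the maximal standardized CUSUM statistic and its location."""
    n = X.shape[0]
    v = direction / np.linalg.norm(direction)
    y = X @ v                              # projected univariate series
    y = y - y.mean()
    partial = np.cumsum(y)[:-1]            # partial sums S_1, ..., S_{n-1}
    sigma = y.std(ddof=1)
    stat = np.abs(partial) / (sigma * np.sqrt(n))
    k = int(np.argmax(stat)) + 1
    return stat.max(), k

# Toy example: a single mean shift at t = 100 along the first coordinate,
# detected with the oracle projection direction.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 50))
X[100:, 0] += 2.0
stat, khat = projected_cusum(X, np.eye(50)[0])
```

Passing a random unit vector instead of the oracle direction mimics the random-projection variant discussed in the abstract.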
This manuscript makes two contributions to the field of change-point detection. In a general change-point setting, we provide a generic algorithm for aggregating local homogeneity tests into an estimator of the change-points of a time series. Interestingly, we establish that the error rates of the collection of tests directly translate into detection properties of the change-point estimator. This generic scheme is then applied to the problem of possibly sparse multivariate mean change-point detection. When the noise is Gaussian, we derive minimax optimal rates that are adaptive to the unknown sparsity and to the distance between change-points. For sub-Gaussian noise, we introduce a variant that is optimal in almost all sparsity regimes.
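A minimal sketch of the aggregation idea, using plain binary segmentation with a CUSUM-based local test (an illustrative stand-in for the paper's generic algorithm; the threshold value and helper names are assumptions):

```python
import numpy as np

def binary_segmentation(y, thresh):
    """Recursively split a segment at its maximal CUSUM point whenever the
    local test statistic exceeds `thresh`; collect the split points."""
    cps = []
    def recurse(s, e):
        if e - s < 2:
            return
        seg = y[s:e] - y[s:e].mean()
        stat = np.abs(np.cumsum(seg)[:-1]) / np.sqrt(e - s)
        k = int(np.argmax(stat))
        if stat[k] > thresh:
            cps.append(s + k + 1)
            recurse(s, s + k + 1)
            recurse(s + k + 1, e)
    recurse(0, len(y))
    return sorted(cps)

# Toy example: unit-variance noise with mean jumps at t = 50 and t = 100.
rng = np.random.default_rng(2)
y = rng.standard_normal(150)
y[50:100] += 3.0
cps = binary_segmentation(y, thresh=2.0)
```

The key point mirrored here is that the detection behavior of the resulting estimator is driven entirely by the error rates of the local tests being aggregated.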
We consider a high-dimensional regression model with a possible change-point due to a covariate threshold, and develop a Lasso estimator of the regression coefficients as well as of the threshold parameter. Our Lasso estimator not only selects covariates but also selects between the linear and the threshold regression models. Under a sparsity assumption, we derive non-asymptotic oracle inequalities for both the prediction risk and the $\ell_1$ estimation loss of the regression coefficients. Since the Lasso estimator performs variable and model selection simultaneously, we show that oracle inequalities can be established without pretesting for the existence of the threshold effect. Furthermore, we establish conditions under which the estimation error of the unknown threshold parameter can be bounded by a nearly $n^{-1}$ factor even when the number of regressors can be much larger than the sample size ($n$). We illustrate the usefulness of our proposed estimation method via Monte Carlo simulations and an application to real data.
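A hedged toy version of the idea (illustrative only; unlike the paper's joint estimator, this sketch simply grid-searches the threshold and fits a plain Lasso, via proximal gradient descent, on the augmented design):

```python
import numpy as np

def lasso_ista(Z, y, lam, n_iter=500):
    """Plain proximal gradient (ISTA) for (1/2n)||y - Zb||^2 + lam ||b||_1."""
    n, p = Z.shape
    L = np.linalg.norm(Z, 2) ** 2 / n          # Lipschitz constant of the gradient
    b = np.zeros(p)
    for _ in range(n_iter):
        grad = Z.T @ (Z @ b - y) / n
        u = b - grad / L
        b = np.sign(u) * np.maximum(np.abs(u) - lam / L, 0.0)
    return b

def threshold_lasso(X, q, y, lam, tau_grid):
    """For each candidate threshold tau, fit a Lasso on the augmented design
    [X, X * 1{q > tau}] and keep the fit with the smallest objective."""
    best = None
    for tau in tau_grid:
        Z = np.hstack([X, X * (q > tau)[:, None]])
        b = lasso_ista(Z, y, lam)
        obj = 0.5 * np.mean((y - Z @ b) ** 2) + lam * np.abs(b).sum()
        if best is None or obj < best[0]:
            best = (obj, tau, b)
    return best[1], best[2]

# Toy example: one main effect and one threshold effect kicking in at tau = 0.5.
rng = np.random.default_rng(3)
n = 200
X = rng.standard_normal((n, 10))
q = rng.random(n)
y = X[:, 0] + 2.0 * X[:, 1] * (q > 0.5) + 0.5 * rng.standard_normal(n)
tau_hat, b_hat = threshold_lasso(X, q, y, lam=0.05,
                                 tau_grid=np.linspace(0.2, 0.8, 13))
```

If no threshold effect is present, the interaction block is simply shrunk to zero, which is the model-selection behavior the abstract alludes to.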
Let $(Y,(X_i)_{i\in\mathcal{I}})$ be a zero-mean Gaussian vector and let $V$ be a subset of $\mathcal{I}$. Suppose we are given $n$ i.i.d. replications of the vector $(Y,X)$. We propose a new test of the hypothesis that $Y$ is independent of $(X_i)_{i\in \mathcal{I}\backslash V}$ conditionally on $(X_i)_{i\in V}$, against the general alternative that it is not. This procedure does not depend on any prior information on the covariance of $X$ or the variance of $Y$, and it applies in a high-dimensional setting. It extends straightforwardly to testing the neighbourhood of a Gaussian graphical model. The procedure is based on a model of Gaussian regression with random Gaussian covariates. We give non-asymptotic properties of the test and we prove that it is rate optimal (up to a possible $\log(n)$ factor) over various classes of alternatives under some additional assumptions. Moreover, it allows us to derive non-asymptotic minimax rates of testing in this setting. Finally, we carry out a simulation study in order to evaluate the performance of our procedure.
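A simple residual-based stand-in for such a conditional-independence test (not the paper's procedure; the OLS residualization, Fisher-z p-values, and Bonferroni correction are all assumptions chosen for illustration):

```python
import math
import numpy as np

def residual_independence_test(Y, X, V, alpha=0.05):
    """Regress Y on the conditioning block X[:, V] by OLS, then apply a
    Fisher-z test to the correlation between the residuals and each remaining
    covariate, with a Bonferroni correction over those covariates.
    Returns True if conditional independence is rejected."""
    n, p = X.shape
    rest = [j for j in range(p) if j not in V]
    XV = np.hstack([np.ones((n, 1)), X[:, V]])
    resid = Y - XV @ np.linalg.lstsq(XV, Y, rcond=None)[0]
    pvals = []
    for j in rest:
        r = np.corrcoef(resid, X[:, j])[0, 1]
        z = math.atanh(r) * math.sqrt(n - len(V) - 3)
        pvals.append(math.erfc(abs(z) / math.sqrt(2)))  # two-sided p-value
    return min(pvals) * len(rest) < alpha

# Toy example: Y depends on X_5 beyond the conditioning set V = {0, 1}.
rng = np.random.default_rng(4)
n = 300
X = rng.standard_normal((n, 8))
Y = X[:, 0] + X[:, 1] + 0.8 * X[:, 5] + rng.standard_normal(n)
reject = residual_independence_test(Y, X, V=[0, 1])
```

Unlike the paper's test, this sketch is not rate optimal and relies on a fixed, low-dimensional conditioning set.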
This paper proves the asymptotic normality of a statistic for detecting heteroscedasticity in linear regression models, without assuming randomness of the covariates, when the sample size $n$ tends to infinity and the number of covariates $p$ is either fixed or tends to infinity. Moreover, our approach shows that this asymptotic normality holds even without homoscedasticity.
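For concreteness, a classical Breusch-Pagan-type score statistic, which is related in spirit but is not the statistic analyzed in this abstract, can be sketched as follows:

```python
import numpy as np

def breusch_pagan_stat(X, y):
    """n * R^2 from regressing squared OLS residuals on the covariates;
    asymptotically chi-squared with p degrees of freedom under
    homoscedastic Gaussian errors."""
    n = X.shape[0]
    Z = np.hstack([np.ones((n, 1)), X])
    resid = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]
    u = resid ** 2
    fitted = Z @ np.linalg.lstsq(Z, u, rcond=None)[0]
    r2 = 1.0 - np.sum((u - fitted) ** 2) / np.sum((u - u.mean()) ** 2)
    return n * r2

# Toy example: error variance increasing with the first covariate
# versus homoscedastic errors.
rng = np.random.default_rng(5)
n = 500
X = rng.standard_normal((n, 3))
beta = np.array([1.0, -1.0, 0.5])
y_het = X @ beta + np.exp(0.5 * X[:, 0]) * rng.standard_normal(n)
y_hom = X @ beta + rng.standard_normal(n)
stat_het = breusch_pagan_stat(X, y_het)
stat_hom = breusch_pagan_stat(X, y_hom)
```

The abstract's contribution concerns the regime where $p$ may also grow, which this fixed-$p$ classical sketch does not cover.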
Structural changes occur quite frequently in dynamic networks, and their detection is an important question in many situations, such as fraud detection or cybersecurity. Real-life networks are often incompletely observed, due to individual non-response or network size. In the present paper we consider the problem of change-point detection in a temporal sequence of partially observed networks. The goal is to test whether there is a change in the network parameters. Our approach is based on the Matrix CUSUM test statistic and allows the size of the networks to grow. We show that the proposed test is minimax optimal and robust to missing links. We also demonstrate the good behavior of our approach in practice through a simulation study and a real-data application.
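The matrix CUSUM idea can be sketched as follows (an illustration under assumed names, with an Erdos-Renyi example; partial observation is handled here by simply zeroing unobserved entries, which is cruder than the paper's treatment of missing links):

```python
import numpy as np

def matrix_cusum_scan(A, obs=None):
    """A: (T, n, n) sequence of adjacency matrices; obs: (n, n) indicator of
    observed entries.  Returns the maximal weighted Frobenius-norm CUSUM
    statistic and the estimated change time."""
    T = A.shape[0]
    if obs is not None:
        A = A * obs                        # zero out unobserved links
    stats = []
    for k in range(1, T):
        diff = A[:k].mean(axis=0) - A[k:].mean(axis=0)
        w = np.sqrt(k * (T - k) / T)       # CUSUM weighting
        stats.append(w * np.linalg.norm(diff))   # Frobenius norm
    khat = int(np.argmax(stats)) + 1
    return max(stats), khat

# Toy example: Erdos-Renyi networks whose edge probability jumps from
# 0.1 to 0.4 at t = 20, with about 20% of the links unobserved.
rng = np.random.default_rng(6)
m, T = 20, 40
def erdos_renyi(p):
    upper = np.triu((rng.random((m, m)) < p).astype(float), 1)
    return upper + upper.T
A = np.stack([erdos_renyi(0.1) for _ in range(T // 2)]
             + [erdos_renyi(0.4) for _ in range(T // 2)])
obs = np.triu((rng.random((m, m)) < 0.8).astype(float), 1)
obs = obs + obs.T
stat, khat = matrix_cusum_scan(A, obs)
```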