Vector Auto-Regressive (VAR) models capture lead-lag temporal dynamics of multivariate time series data. They have been widely used in macroeconomics, financial econometrics, neuroscience and functional genomics. In many applications, the data exhibit structural changes in their autoregressive dynamics, which correspond to changes in the transition matrices of the VAR model that specify such dynamics. We present the R package VARDetect that implements two classes of algorithms to detect multiple change points in piecewise stationary VAR models. The first exhibits sublinear computational complexity in the number of time points and is best suited for structured sparse models, while the second exhibits linear time complexity and is designed for models whose transition matrices are assumed to have a low rank plus sparse decomposition. The package also has functions to generate data from the various variants of VAR models discussed, which is useful in simulation studies, as well as to visualize the results through network layouts.
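As a rough illustration of the piecewise stationary setting described above, the following sketch simulates a VAR(1) process whose transition matrix switches at a known change point. It is written in plain Python and does not use or reproduce the VARDetect API; the function name `simulate_piecewise_var1`, its arguments, and the two example matrices are assumptions made here for illustration only.

```python
import random

def simulate_piecewise_var1(transitions, change_points, n_obs, noise_sd=1.0, seed=0):
    """Simulate x_t = A_j x_{t-1} + e_t, where A_j is the transition matrix
    of the segment that contains time t (hypothetical helper, not VARDetect).

    transitions   : list of p-by-p matrices (lists of lists), one per segment
    change_points : sorted time indices at which the next matrix takes over
    """
    if len(transitions) != len(change_points) + 1:
        raise ValueError("need exactly one more transition matrix than change points")
    rng = random.Random(seed)
    p = len(transitions[0])
    x_prev = [0.0] * p          # start the process at the origin
    series = []
    segment = 0
    for t in range(n_obs):
        # Move to the next regime once t reaches the next change point.
        while segment < len(change_points) and t >= change_points[segment]:
            segment += 1
        a = transitions[segment]
        x_new = [
            sum(a[i][j] * x_prev[j] for j in range(p)) + rng.gauss(0.0, noise_sd)
            for i in range(p)
        ]
        series.append(x_new)
        x_prev = x_new
    return series

# Two sparse, stable regimes for a bivariate series: the lead-lag
# direction between the two components flips at t = 100.
A1 = [[0.5, 0.0], [0.4, 0.5]]
A2 = [[0.5, -0.4], [0.0, 0.5]]
data = simulate_piecewise_var1([A1, A2], change_points=[100], n_obs=200)
```

Data of this kind, with the true change point known, is what a simulation study would pass to a detection routine in order to measure localization error.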
Several statistical approaches based on reproducing kernels have been proposed to detect abrupt changes arising in the full distribution of the observations and not only in the mean or variance. Some of these approaches enjoy good statistical properties (oracle inequality, \ldots). Nonetheless, they have a high computational cost both in terms of time and memory. This makes their application difficult even for small and medium sample sizes ($n < 10^4$). This computational issue is addressed by first describing a new efficient and exact algorithm for kernel multiple change-point detection with an improved worst-case complexity that is quadratic in time and linear in space. It allows dealing with medium-size signals (up to $n \approx 10^5$). Second, a faster but approximate algorithm is described. It is based on a low-rank approximation to the Gram matrix and is linear in time and space. This approximate algorithm can be applied to large-scale signals ($n \geq 10^6$). These exact and approximate algorithms have been implemented in \texttt{R} and \texttt{C} for various kernels. The computational and statistical performances of these new algorithms have been assessed through empirical experiments. The new algorithms are observed to run faster than the other procedures considered. Finally, simulations confirm the higher statistical accuracy of kernel-based approaches in detecting changes that are not only in the mean. These simulations also illustrate the flexibility of kernel-based approaches for analyzing complex biological profiles made of DNA copy number and allele B frequencies. An R package implementing the approach will be made available on GitHub.
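To make the kernel criterion concrete, the sketch below segments a scalar signal by dynamic programming over a kernel least-squares cost, assuming a Gaussian kernel and a fixed number of segments. It is a naive illustration only: it stores two-dimensional prefix sums of the full Gram matrix and therefore needs quadratic memory, unlike the linear-space exact algorithm described above. The names `gaussian_kernel` and `kernel_change_points` are hypothetical.

```python
import math

def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian kernel on scalars (the bandwidth is an arbitrary choice here)."""
    return math.exp(-((x - y) ** 2) / (2.0 * bandwidth ** 2))

def kernel_change_points(signal, n_segments, kernel=gaussian_kernel):
    """Return the change points minimizing the kernel least-squares criterion
    for a fixed number of segments (naive sketch, quadratic memory)."""
    n = len(signal)
    # prefix[i][j] = sum of k(x_a, x_b) over a < i and b < j
    prefix = [[0.0] * (n + 1) for _ in range(n + 1)]
    diag = [0.0] * (n + 1)      # diag[i] = sum of k(x_a, x_a) over a < i
    for i in range(n):
        row_sum = 0.0
        for j in range(n):
            row_sum += kernel(signal[i], signal[j])
            prefix[i + 1][j + 1] = prefix[i][j + 1] + row_sum
        diag[i + 1] = diag[i] + kernel(signal[i], signal[i])

    def cost(s, e):
        # Within-segment dispersion of signal[s:e] in the feature space.
        block = prefix[e][e] - prefix[s][e] - prefix[e][s] + prefix[s][s]
        return (diag[e] - diag[s]) - block / (e - s)

    inf = float("inf")
    # best[d][e] = minimal cost of splitting signal[:e] into d segments
    best = [[inf] * (n + 1) for _ in range(n_segments + 1)]
    arg = [[0] * (n + 1) for _ in range(n_segments + 1)]
    best[0][0] = 0.0
    for d in range(1, n_segments + 1):
        for e in range(d, n + 1):
            for s in range(d - 1, e):
                candidate = best[d - 1][s] + cost(s, e)
                if candidate < best[d][e]:
                    best[d][e] = candidate
                    arg[d][e] = s
    # Backtrack from the end of the signal to recover the segment starts.
    change_points = []
    e = n
    for d in range(n_segments, 1, -1):
        e = arg[d][e]
        change_points.append(e)
    return sorted(change_points)
```

For example, `kernel_change_points([0.0] * 20 + [5.0] * 20, 2)` splits the signal at index 20. In practice the number of segments is unknown and would be chosen by a model-selection step that this sketch omits.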
The multivariate Bayesian structural time series (MBSTS) model \citep{qiu2018multivariate,Jammalamadaka2019Predicting}, a generalized version of many structural time series models, deals with inference and prediction for multiple correlated time series, where one also has the choice of using a different candidate pool of contemporaneous predictors for each target series. The MBSTS model has wide applications and is well suited to feature selection, time series forecasting, nowcasting, inferring causal impact, and other tasks. This paper demonstrates how to use the R package \pkg{mbsts} for MBSTS modeling, establishing a bridge between the user-friendly and developer-friendly functions in the package and the corresponding methodology. A simulated dataset and the object-oriented functions in the \pkg{mbsts} package are explained in a way that enables users to flexibly add or remove components, as well as to simplify or complicate settings.
We consider the detection and localization of change points in the distribution of an offline sequence of observations. Based on a nonparametric framework that uses a similarity graph among observations, we propose new test statistics for the case in which at most one change point occurs and generalize them to the multiple change-point setting. The proposed statistics leverage edge weight information in the graphs, exhibiting substantial improvements in testing power and localization accuracy in simulations. We derive the null limiting distribution, provide accurate analytic approximations to control type I error, and establish theoretical guarantees on the power consistency under contiguous alternatives for the one change-point setting, as well as the minimax localization rate. In the multiple change-point setting, the asymptotic correctness of the estimated number and locations of change points is also guaranteed. The methods are illustrated on the MIT proximity network data.
Structural breaks are commonly seen in applications. For the detection of change points in time, a research gap remains in the ultra-high-dimensional setting, where the covariates may bear spurious correlations. In this paper, we propose a two-stage approach to detect change points in ultra-high dimension: we first propose the dynamic tilted current correlation screening method to reduce the input dimension, and then detect possible change points in the framework of group variable selection. Not only is the spurious correlation between ultra-high-dimensional covariates taken into account in variable screening, but non-convex penalties are also studied for change point detection in ultra-high dimension. Asymptotic properties are derived to guarantee the consistency of the selection procedure, and numerical investigations show the promising performance of the proposed approach.
The variance of noise plays an important role in many change-point detection procedures and the associated inferences. Most commonly used variance estimators require strong assumptions on the true mean structure or normality of the error distribution, which may not hold in applications. More importantly, the qualities of these estimators have not been discussed systematically in the literature. In this paper, we introduce a framework of equivariant variance estimation for multiple change-point models. In particular, we characterize the set of all equivariant unbiased quadratic variance estimators for a family of change-point model classes, and develop a minimax theory for such estimators.
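For orientation, one classical quadratic variance estimator used in change-point problems is the first-order difference estimator, sketched below. It is offered only as a familiar example of a quadratic form in the observations, not as the estimator class characterized in this paper. The function name `difference_variance` is hypothetical. The estimator is unchanged when a constant is added to every observation and is multiplied by $a^2$ when the data are multiplied by $a$. It is unbiased for the noise variance when the mean is constant and the errors are uncorrelated, and it picks up a bias from each jump in the mean.

```python
def difference_variance(y):
    """First-order difference estimator of the noise variance:
    sum of squared successive differences divided by 2 * (n - 1)."""
    n = len(y)
    if n < 2:
        raise ValueError("need at least two observations")
    squared_diffs = sum((y[i + 1] - y[i]) ** 2 for i in range(n - 1))
    return squared_diffs / (2.0 * (n - 1))

# Small worked example: the differences are 2, -1, 3, so the estimate is 14 / 6.
example = [1.0, 3.0, 2.0, 5.0]
estimate = difference_variance(example)
```

Because only successive differences enter the estimate, a mean shift contributes through a single difference per change point, which is why estimators of this type are popular when the mean is piecewise constant.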