We develop a distribution-free, unsupervised anomaly detection method called ECAD, which wraps around any regression algorithm and sequentially detects anomalies. Rooted in conformal prediction, ECAD does not require data exchangeability but approximately controls the Type-I error when data are normal. Computationally, it involves no data-splitting and efficiently trains ensemble predictors to increase statistical power. We demonstrate the superior performance of ECAD in detecting anomalous spatio-temporal traffic flow.
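The conformal idea behind residual-based sequential detection can be illustrated with a minimal sketch. This is not ECAD itself: it ranks each new absolute residual against all past residuals, which implicitly assumes roughly exchangeable residuals, and it uses no ensemble predictors. The names `conformal_p_value`, `detect_sequentially`, `alpha`, and `warmup` are hypothetical and chosen only for illustration.

```python
def conformal_p_value(past_scores, new_score):
    """Share of scores (past ones plus the new one) at least as large as new_score."""
    n_at_least = sum(1 for s in past_scores if s >= new_score)
    return (n_at_least + 1) / (len(past_scores) + 1)


def detect_sequentially(observed, predicted, alpha=0.05, warmup=20):
    """Flag time steps whose absolute residual is unusually large.

    observed, predicted: equal-length sequences of floats, where predicted
    comes from any regression algorithm (assumed to be fitted elsewhere).
    alpha: target false-alarm level (assumption: 0.05 by default).
    warmup: number of initial steps used only to collect residuals.
    """
    residuals = [abs(y - y_hat) for y, y_hat in zip(observed, predicted)]
    flags = [False] * len(residuals)
    for t in range(warmup, len(residuals)):
        # Compare the current residual with every residual seen so far.
        p = conformal_p_value(residuals[:t], residuals[t])
        flags[t] = p <= alpha
    return flags
```

A small p-value means few past residuals were as large as the current one, so the observation is flagged. Residuals of flagged points stay in the comparison set here; whether to exclude them is a design choice this sketch leaves open.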
We propose Robust Lasso-Zero, an extension of the Lasso-Zero methodology [Descloux and Sardy, 2018], initially introduced for sparse linear models, to the sparse corruptions problem. We give theoretical guarantees on the sign recovery of the parameters for a slightly simplified version of the estimator, called Thresholded Justice Pursuit. The use of Robust Lasso-Zero is showcased for variable selection with missing values in the covariates. In addition to not requiring the specification of a model for the covariates, nor estimating their covariance matrix or the noise variance, the method has the great advantage of handling missing-not-at-random values without specifying a parametric model. Numerical experiments and a medical application underline the relevance of Robust Lasso-Zero in such a context with few available competitors. The method is easy to use and implemented in the R package lass0.
The algorithms used for optimal management of ambulances require accurate description and prediction of the spatio-temporal evolution of emergency interventions. In recent years, several authors have proposed sophisticated statistical approaches to forecast ambulance dispatches, typically modelling the events as a point pattern occurring on a planar region. Nevertheless, ambulance interventions can be more appropriately modelled as a realisation of a point process occurring along a network of lines, such as a road network. The constrained spatial domain raises specific challenges and unique methodological problems that cannot be ignored when developing a proper statistical model. Hence, this paper proposes a spatio-temporal model to analyse the ambulance interventions that occurred in the road network of Milan (Italy) from 2015 to 2017. We adopt a non-separable first-order intensity function with spatial and temporal terms. The temporal component is estimated semi-parametrically using a Poisson regression model, while the spatial dimension is estimated nonparametrically using a network kernel function. A set of weights is included in the spatial term to capture space-time interactions, inducing non-separability in the intensity function. A series of maps and graphical tests show that our approach successfully models the ambulance interventions and captures the space-time patterns.
Background: All-in-one station-based health monitoring devices are implemented in elder homes in Hong Kong to support the monitoring of vital signs of the elderly. During a pilot study, it was discovered that the systolic blood pressure had been measured incorrectly for several weeks. A real-time solution was needed to identify future data quality issues as soon as possible. Methods: Control charts are an effective tool for real-time monitoring and for signaling issues (changes) in data. In this study, as in other healthcare applications, many observations are missing, and few methods are available for monitoring data with missing observations. A data quality monitoring method is developed to quickly signal issues with the accuracy of the collected data while handling missing observations. A Hotelling's T-squared control chart is selected as the basis for the proposed method. Findings: The proposed method is retrospectively validated on a case study with a known measurement error in the systolic blood pressure measurements. The method is able to adequately detect this data quality problem. The proposed method was integrated into a personalized telehealth monitoring system and prospectively implemented in a second case study. It was found that the proposed scheme supports the control of data quality. Conclusions: Data quality is an important issue, and control charts are useful for real-time monitoring of data quality. However, these charts must be adjusted to account for the missing data that often occur in healthcare contexts.
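To make the role of the Hotelling's T-squared statistic concrete, here is a minimal sketch that computes T² = (x − μ)ᵀ Σ⁻¹ (x − μ) using only the coordinates that were actually observed. Restricting the statistic to the observed sub-vector is one simple way to cope with missing values and is an assumption of this sketch, not necessarily the adjustment used in the study. The function names, the use of `None` for a missing measurement, and the in-control mean and covariance inputs are all hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified.
    m = [[float(v) for v in a[i]] + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("covariance sub-matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def t_squared_observed(x, mean, cov):
    """Hotelling's T-squared over the observed (non-None) coordinates of x.

    x: one multivariate measurement, with None marking a missing value.
    mean, cov: in-control mean vector and covariance matrix (assumed known
    or estimated beforehand from clean data).
    Returns (statistic, number_of_observed_coordinates).
    """
    idx = [i for i, v in enumerate(x) if v is not None]
    if not idx:
        return 0.0, 0
    diff = [x[i] - mean[i] for i in idx]
    sub_cov = [[cov[i][j] for j in idx] for i in idx]
    z = solve_linear(sub_cov, diff)  # z = sub_cov^{-1} diff
    return sum(d * zi for d, zi in zip(diff, z)), len(idx)
```

Because the number of observed coordinates changes from one measurement to the next, the signaling limit has to change with it. If the mean and covariance are treated as known and the data as multivariate normal, the statistic follows a chi-square distribution with that many degrees of freedom, so the returned count tells the caller which limit to apply.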
Temporal anomaly detection looks for irregularities over space-time. Unsupervised temporal models employed thus far typically work on sequences of feature vectors, and much less on temporal multiway data. We focus our investigation on two-way data, in which a data matrix is observed at each time step. Leveraging recent advances in matrix-native recurrent neural networks, we investigated strategies for data arrangement and unsupervised training for temporal multiway anomaly detection. These include compressing-decompressing, encoding-predicting, and temporal data differencing. We conducted a comprehensive suite of experiments to evaluate model behaviors under various settings on synthetic data, moving digits, and ECG recordings. We found interesting phenomena not previously reported. These include the capacity of the compact matrix LSTM to compress noisy data near perfectly, making the compressing-decompressing strategy ill-suited for anomaly detection under noise. Also, long sequences of vectors can be addressed directly by matrix models that allow very long context and multiple-step prediction. Overall, the encoding-predicting strategy works very well for the matrix LSTMs in the conducted experiments, thanks to their compactness and better fit to the data dynamics.
Video anomaly detection has gained significant attention due to the increasing demand for automatic monitoring of surveillance videos. In particular, the prediction-based approach is one of the most studied methods: after learning from the normal frames of the training set, it detects anomalies by predicting frames of the test set that include abnormal events. However, many prediction networks are computationally expensive owing to their use of pre-trained optical flow networks, or fail to detect abnormal situations because their generative ability is strong enough to predict even the anomalies. To address these shortcomings, we propose spatial rotation transformation (SRT) and temporal mixing transformation (TMT) to generate irregular patch cuboids within normal frame cuboids in order to enhance the learning of normal features. Additionally, the proposed patch transformation is used only during the training phase, allowing our model to detect abnormal frames quickly during inference. Our model is evaluated on three anomaly detection benchmarks, achieving competitive accuracy and surpassing all previous works in terms of speed.
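The two patch transformations can be pictured with a small sketch. It is one possible reading of the abstract, not the authors' implementation: a frame cuboid is represented as a nested list indexed `[frame][row][column]`, the spatial transformation rotates one square patch by quarter turns in every frame, and the temporal transformation shuffles the frame order of that patch only. The function names, the square patch shape, and the restriction to 90-degree rotations are assumptions made for illustration.

```python
import random


def rotate_patch_90(patch):
    """Rotate a square 2-D patch (list of rows) 90 degrees clockwise."""
    return [list(row) for row in zip(*patch[::-1])]


def spatial_rotation(cuboid, top, left, size, quarter_turns):
    """Return a copy of the cuboid with one square patch rotated in every frame."""
    out = [[row[:] for row in frame] for frame in cuboid]
    for frame in out:
        patch = [row[left:left + size] for row in frame[top:top + size]]
        for _ in range(quarter_turns % 4):
            patch = rotate_patch_90(patch)
        # Write the rotated patch back into the same location.
        for i, row in enumerate(patch):
            frame[top + i][left:left + size] = row
    return out


def temporal_mixing(cuboid, top, left, size, rng):
    """Return a copy of the cuboid with the frame order of one patch shuffled.

    rng: a random.Random instance, passed in so results are reproducible.
    """
    out = [[row[:] for row in frame] for frame in cuboid]
    order = list(range(len(cuboid)))
    rng.shuffle(order)
    for dst, src in enumerate(order):
        # The patch shown in frame dst is taken from frame src.
        for i in range(size):
            out[dst][top + i][left:left + size] = cuboid[src][top + i][left:left + size]
    return out
```

Both functions leave the pixels outside the patch untouched and return new cuboids, so they could be applied on the fly during training and skipped entirely at inference, in line with the abstract's statement that the transformation is used only in the training phase.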