
Multivariate outlier detection based on a robust Mahalanobis distance with shrinkage estimators

Published by Elisa Cabana Garceran Del Vall
Publication date: 2019
Research field: Mathematical statistics
Language: English





A collection of robust Mahalanobis distances for multivariate outlier detection is proposed, based on the notion of shrinkage. Robust intensity and scaling factors are optimally estimated to define the shrinkage. Some properties are investigated, such as affine equivariance and breakdown value. The performance of the proposal is illustrated through comparison with other techniques from the literature, in a simulation study and with a real dataset. The behavior when the underlying distribution is heavy-tailed or skewed shows the appropriateness of the method when we deviate from the common assumption of normality. The resulting high correct detection rates and low false detection rates in the vast majority of cases, as well as the significantly smaller computation time, show the advantages of our proposal.
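As a rough illustration of the general idea in the abstract (a Mahalanobis distance computed from a shrunk covariance matrix), here is a minimal Python sketch. It is not the estimator proposed in the paper: the function name `shrinkage_mahalanobis`, the fixed `intensity` weight, the scaled-identity target, the componentwise median as the center, and the use of the classical sample covariance are all assumptions made for illustration. The paper instead estimates robust intensity and scaling factors optimally.

```python
import numpy as np


def shrinkage_mahalanobis(X, intensity=0.1):
    """Squared Mahalanobis distances from a median center and a shrunk covariance.

    `intensity` in [0, 1] is a hypothetical, hand-picked shrinkage weight;
    it is NOT estimated optimally as in the paper.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    center = np.median(X, axis=0)             # componentwise median (assumed location)
    S = np.cov(X, rowvar=False)               # classical covariance (not robust)
    target = (np.trace(S) / p) * np.eye(p)    # scaled-identity shrinkage target
    sigma = (1.0 - intensity) * S + intensity * target
    diff = X - center
    # Solve sigma @ Z = diff.T rather than inverting sigma explicitly.
    Z = np.linalg.solve(sigma, diff.T)
    return np.einsum("ij,ji->i", diff, Z)     # row-wise quadratic forms


# Usage: 200 standard normal points in 3 dimensions plus one planted outlier.
rng = np.random.default_rng(0)
X = np.vstack([rng.standard_normal((200, 3)), [[10.0, 10.0, 10.0]]])
d2 = shrinkage_mahalanobis(X, intensity=0.1)

# 16.27 is approximately the 0.999 quantile of a chi-square with 3 degrees
# of freedom, a common reference cutoff under normality.
flags = d2 > 16.27
```

Because the covariance here is the classical one, several outliers could still mask each other; a robust scatter estimate in place of `S` would be the natural next step.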




Read also

A robust estimator is proposed for the parameters that characterize the linear regression problem. It is based on the notion of shrinkage, often used in finance and previously studied for outlier detection in multivariate data. A thorough simulation study is conducted to investigate the efficiency with normal and heavy-tailed errors, the robustness under contamination, the computational times, and the affine equivariance and breakdown value of the regression estimator. Two classical datasets often used in the literature and a real socio-economic dataset about the Living Environment Deprivation of areas in Liverpool (UK) are studied. The results from the simulations and the real data examples show the advantages of the proposed robust estimator in regression.
In high reliability standards fields such as automotive, avionics or aerospace, the detection of anomalies is crucial. An efficient methodology for automatically detecting multivariate outliers is introduced. It takes advantage of the remarkable properties of the Invariant Coordinate Selection (ICS) method. Based on the simultaneous spectral decomposition of two scatter matrices, ICS leads to an affine invariant coordinate system in which the Euclidean distance corresponds to a Mahalanobis Distance (MD) in the original coordinates. The limitations of MD are highlighted using theoretical arguments in a context where the dimension of the data is large. Unlike MD, ICS makes it possible to select relevant components, which removes these limitations. Owing to the resulting dimension reduction, the method is expected to improve the power of outlier detection rules such as MD-based criteria. It also greatly simplifies the interpretation of outliers. The paper includes practical guidelines for using ICS in the context of a small proportion of outliers, which is relevant in high reliability standards fields. The choice of scatter matrices and the selection of relevant invariant components through parallel analysis and normality tests are addressed. The use of the regular covariance matrix and the so-called matrix of fourth moments as the scatter pair is recommended. This choice combines simplicity of implementation with the possibility of deriving theoretical results. A simulation study confirms the good properties of the proposal and compares it with other scatter pairs. This study also provides a comparison with Principal Component Analysis and MD. The performance of our proposal is also evaluated on several real data sets using a user-friendly R package accompanying the paper.
We consider the robust filtering problem for a nonlinear state-space model with outliers in measurements. To improve the robustness of the traditional Kalman filtering algorithm, we propose in this work two robust filters based on mixture correntropy, specifically the double-Gaussian mixture correntropy and the Laplace-Gaussian mixture correntropy. We formulate the robust filtering problem by adopting the mixture-correntropy-induced cost to replace the quadratic one in the conventional Kalman filter for measurement fitting errors. In addition, a tradeoff weight coefficient is introduced to ensure that the proposed approaches provide reasonable state estimates in scenarios where measurement fitting errors are small. The formulated robust filtering problems are solved iteratively using the cubature Kalman filtering framework with a reweighted measurement covariance. Numerical results show that the proposed methods can achieve a performance improvement over existing robust solutions.
In a network meta-analysis, some of the collected studies may deviate markedly from the others, for example by having very unusual effect sizes. These deviating studies can be regarded as outlying with respect to the rest of the network and can be influential on the pooled results. Thus, it could be inappropriate to synthesize those studies without further investigation. In this paper, we propose two Bayesian methods to detect outliers in a network meta-analysis via (a) a mean-shifted outlier model and (b) posterior predictive p-values constructed from ad-hoc discrepancy measures. The former method uses Bayes factors to formally test each study against outliers, while the latter provides a score of outlyingness for each study in the network, which makes it possible to numerically quantify the uncertainty associated with being an outlier. Furthermore, we present a simple method based on informative priors as part of the network meta-analysis model to down-weight the detected outliers. We conduct extensive simulations to evaluate the effectiveness of the proposed methodology while comparing it to some alternative, available outlier diagnostic tools. Two real networks of interventions are then used to demonstrate our methods in practice.
Smart metering infrastructures collect data almost continuously in the form of fine-grained long time series. These massive time series often have common daily patterns that are repeated between similar days or seasons and shared between grouped meters. Within this context, we propose a method to highlight individuals with abnormal daily dependency patterns, which we term evolution outliers. To this end, we approach the problem from the standpoint of Functional Data Analysis (FDA), by treating each daily record as a function or curve. We then focus on the morphological aspects of the observed curves, such as daily magnitude, daily shape, derivatives, and inter-day evolution. The proposed method for evolution outliers relies on the concept of functional depth, which has been a cornerstone in the FDA literature for building shape and magnitude outlier detection methods. In conjunction with our evolution outlier proposal, these methods provide an outlier detection toolbox for smart meter data that covers a wide palette of functional outlier classes. We illustrate the outlier identification ability of this toolbox using actual smart metering data corresponding to photovoltaic energy generation and circuit voltage records.