
Robust low-rank covariance matrix estimation with a general pattern of missing values

Published by Alexandre Hippert-Ferrer
Publication date: 2021
Language: English





This paper tackles the problem of robust covariance matrix estimation when the data is incomplete. Classical statistical estimation methods are usually built upon the Gaussian assumption, whereas existing robust estimators assume unstructured signal models. The former can be inaccurate on real-world data sets in which heterogeneity produces heavy-tailed distributions, while the latter do not exploit the low-rank structure that the signal usually has. Combining the strengths of both, a covariance matrix estimation procedure is designed on a robust (compound Gaussian) low-rank model by leveraging the observed-data likelihood function within an expectation-maximization algorithm. It is also designed to handle a general pattern of missing values. The proposed procedure is first validated on simulated data sets. Then, its usefulness for classification and clustering applications is assessed on two real data sets with missing values, which include multispectral and hyperspectral time series.
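As a rough illustration of the expectation-maximization idea mentioned in the abstract, the sketch below runs EM for a plain zero-mean Gaussian covariance with an arbitrary pattern of missing entries. It is not the paper's procedure: the compound Gaussian model and the low-rank structure are left out, and the function name `em_gaussian_covariance`, the NaN encoding of missing values, and the fixed iteration count are assumptions made for this example.

```python
import numpy as np

def em_gaussian_covariance(X, n_iter=50):
    """EM estimate of a zero-mean Gaussian covariance (illustrative sketch).

    X is an (n, p) array in which missing entries are coded as NaN
    (an assumption of this example). Any missingness pattern is allowed.
    """
    n, p = X.shape
    observed = ~np.isnan(X)
    sigma = np.eye(p)  # initial guess
    for _ in range(n_iter):
        S = np.zeros((p, p))
        for i in range(n):
            o = observed[i]
            m = ~o
            x_hat = np.zeros(p)
            x_hat[o] = X[i, o]
            C = np.zeros((p, p))  # conditional covariance of the missing part
            if m.any() and o.any():
                S_oo = sigma[np.ix_(o, o)]
                S_mo = sigma[np.ix_(m, o)]
                # K = S_mo @ inv(S_oo), computed with a linear solve
                K = np.linalg.solve(S_oo, S_mo.T).T
                # E-step: conditional mean and covariance of missing entries
                x_hat[m] = K @ X[i, o]
                C[np.ix_(m, m)] = sigma[np.ix_(m, m)] - K @ S_mo.T
            elif m.any():
                # Row with nothing observed: conditional law is the prior
                C = sigma.copy()
            S += np.outer(x_hat, x_hat) + C
        # M-step: average of the expected outer products
        sigma = S / n
    return sigma
```

When no entry is missing, every iteration returns the sample covariance `X.T @ X / n`, which gives a quick sanity check of the sketch.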




Read also

We consider the problem of estimating high-dimensional covariance matrices of a particular structure, which is a summation of low rank and sparse matrices. This covariance structure has a wide range of applications including factor analysis and random effects models. We propose a Bayesian method of estimating the covariance matrices by representing the covariance model in the form of a factor model with unknown number of latent factors. We introduce binary indicators for factor selection and rank estimation for the low rank component combined with a Bayesian lasso method for the sparse component estimation. Simulation studies show that our method can recover the rank as well as the sparsity of the two components respectively. We further extend our method to a graphical factor model where the graphical model of the residuals as well as selecting the number of factors is of interest. We employ a hyper-inverse Wishart prior for modeling decomposable graphs of the residuals, and a Bayesian graphical lasso selection method for unrestricted graphs. We show through simulations that the extended models can recover both the number of latent factors and the graphical model of the residuals successfully when the sample size is sufficient relative to the dimension.
This paper proposes a low-complexity antenna layout-aware (ALA) covariance matrix estimation method. In the estimation process, the antenna layout is assumed known at the estimator. Using this information, the estimator finds antenna pairs with statistically equivalent covariance values and sets their covariance values to the average of the covariance values of all these antenna pairs. ALA for both uniform linear array (ULA) and uniform planar array (UPA) is discussed. This method exploits the fact that covariance matrices do not have full degrees of freedom. Then, the proposed ALA covariance matrix method is applied to a multi-cell network. Simulations have demonstrated that the proposed method can provide better performance than the widely used viaQ method, with respect to mean square errors and downlink spectral efficiencies.
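For a ULA, one plausible reading of the averaging step is that antenna pairs with the same index separation share a covariance value, which amounts to averaging each diagonal of the sample covariance. The sketch below illustrates that reading only; the equivalence rule, the function name `toeplitz_average`, and the use of NumPy are assumptions, not details taken from the paper.

```python
import numpy as np

def toeplitz_average(R):
    """Replace every diagonal of a square matrix by its mean (illustrative).

    Assumes entries R[i, j] with the same offset j - i are statistically
    equivalent, which is a hypothetical rule for a uniform linear array.
    """
    R = np.asarray(R)
    n = R.shape[0]
    out = np.empty(R.shape, dtype=np.result_type(R, float))
    for k in range(-(n - 1), n):
        mean_k = np.diagonal(R, offset=k).mean()
        # Row indices i for which (i, i + k) lies inside the matrix
        rows = np.arange(max(0, -k), min(n, n - k))
        out[rows, rows + k] = mean_k
    return out
```

Averaging along diagonals reduces the number of free values from n² to 2n - 1, which is one way to see the abstract's remark that such covariance matrices do not have full degrees of freedom.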
We consider the problem of estimating a low rank covariance function $K(t,u)$ of a Gaussian process $S(t)$, $t \in [0,1]$, based on $n$ i.i.d. copies of $S$ observed in a white noise. We suggest a new estimation procedure adapting simultaneously to the low rank structure and the smoothness of the covariance function. The new procedure is based on nuclear norm penalization and exhibits superior performances as compared to the sample covariance function by a polynomial factor in the sample size $n$. Other results include a minimax lower bound for estimation of low-rank covariance functions showing that our procedure is optimal as well as a scheme to estimate the unknown noise variance of the Gaussian process.
Missing attributes are ubiquitous in causal inference, as they are in most applied statistical work. In this paper, we consider various sets of assumptions under which causal inference is possible despite missing attributes and discuss corresponding approaches to average treatment effect estimation, including generalized propensity score methods and multiple imputation. Across an extensive simulation study, we show that no single method systematically out-performs others. We find, however, that doubly robust modifications of standard methods for average treatment effect estimation with missing data repeatedly perform better than their non-doubly robust baselines; for example, doubly robust generalized propensity score methods beat inverse-weighting with the generalized propensity score. This finding is reinforced in an analysis of an observational study on the effect on mortality of tranexamic acid administration among patients with traumatic brain injury in the context of critical care management. Here, doubly robust estimators recover confidence intervals that are consistent with evidence from randomized trials, whereas non-doubly robust estimators do not.
In this paper we study covariance estimation with missing data. We consider missing data mechanisms that can be independent of the data, or have a time varying dependency. Additionally, observed variables may have arbitrary (non uniform) and dependent observation probabilities. For each mechanism, we construct an unbiased estimator and obtain bounds for the expected value of their estimation error in operator norm. Our bounds are equivalent, up to constant and logarithmic factors, to state-of-the-art bounds for complete and uniform missing observations. Furthermore, for the more general non uniform and dependent cases, the proposed bounds are new or improve upon previous results. Our error estimates depend on quantities we call scaled effective rank, which generalize the effective rank to account for missing observations. All the estimators studied in this work have the same asymptotic convergence rate (up to logarithmic factors).
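To make the notion of an unbiased estimator under missing observations concrete, the following sketch treats only the simplest setting: zero-mean data, a mask drawn independently of the data, and known pairwise observation probabilities. Under those assumptions the zero-filled product of coordinates i and j has expectation P[i, j] times the true covariance entry, so an elementwise division by P removes the bias. The function name `masked_covariance` and its arguments are hypothetical, and the time-varying mechanisms studied in the paper are not covered.

```python
import numpy as np

def masked_covariance(Y, P):
    """Bias-corrected covariance from zero-filled data (illustrative sketch).

    Y : (n, p) array of zero-mean samples with missing entries set to 0.
    P : (p, p) array, P[i, j] = probability that coordinates i and j are
        both observed (assumed known and strictly positive).
    """
    Y = np.asarray(Y, dtype=float)
    n = Y.shape[0]
    # Sample second moment of the zero-filled data, rescaled entrywise
    return (Y.T @ Y / n) / P
```

With independent coordinates each observed with probability q, `P` has q on the diagonal and q² elsewhere; with non-uniform or dependent observation probabilities, `P` simply has different entries.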