
Optimal spectral shrinkage and PCA with heteroscedastic noise

Added by William Leeb
Publication date: 2018
Language: English





This paper studies the related problems of prediction, covariance estimation, and principal component analysis for the spiked covariance model with heteroscedastic noise. We consider an estimator of the principal components based on whitening the noise, and we derive optimal singular value and eigenvalue shrinkers for use with these estimated principal components. Underlying these methods are new asymptotic results for the high-dimensional spiked model with heteroscedastic noise, and consistent estimators for the relevant population parameters. We extend previous analysis on out-of-sample prediction to the setting of predictors with whitening. We demonstrate certain advantages of noise whitening. Specifically, we show that in a certain asymptotic regime, optimal singular value shrinkage with whitening converges to the best linear predictor, whereas without whitening it converges to a suboptimal linear predictor. We prove that for generic signals, whitening improves estimation of the principal components, and increases a natural signal-to-noise ratio of the observations. We also show that for rank one signals, our estimated principal components achieve the asymptotic minimax rate.
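
As a rough illustration of the whitening-then-shrinkage pipeline, the sketch below whitens each coordinate by an estimated noise standard deviation, applies the standard Frobenius-loss singular value shrinker for the white-noise spiked model (Gavish and Donoho, 2017) as a stand-in for the paper's whitened-model shrinkers, and then unwhitens. The diagonal noise model and the function names are illustrative assumptions, not the authors' exact estimators.

```python
import numpy as np

def shrink_singular_values(y, beta):
    """Frobenius-loss optimal shrinker for the white-noise spiked model
    (Gavish & Donoho 2017). Assumes singular values are normalized so the
    noise bulk edge sits at 1 + sqrt(beta), with beta = p/n <= 1."""
    out = np.zeros_like(y)
    keep = y > 1 + np.sqrt(beta)
    out[keep] = np.sqrt((y[keep] ** 2 - beta - 1) ** 2 - 4 * beta) / y[keep]
    return out

def denoise_with_whitening(Y, noise_std):
    """Whiten rows by per-coordinate noise std, shrink, unwhiten.

    Y         : p x n data matrix, columns are observations (p <= n)
    noise_std : length-p vector of (estimated) noise standard deviations
    """
    p, n = Y.shape
    beta = p / n
    Yw = (Y / noise_std[:, None]) / np.sqrt(n)   # whitened: unit noise, bulk edge 1 + sqrt(beta)
    U, s, Vt = np.linalg.svd(Yw, full_matrices=False)
    Xw = (U * shrink_singular_values(s, beta)) @ Vt
    return Xw * noise_std[:, None] * np.sqrt(n)  # unwhiten back to the original scale
```
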



Related Research

Yifan Yang, Xiaoyu Zhou (2021)
We introduce a stochastic version of Taylor's expansion and of the Mean Value Theorem, originally proved by Aliprantis and Border (1999), and extend them to the multivariate case. In the univariate case, the theorem asserts the following: suppose a real-valued function $f$ has a continuous derivative $f'$ on a closed interval $I$ and $X$ is a random variable on a probability space $(\Omega, \mathcal{F}, P)$. Fix $a \in I$; then there exists a \textit{random variable} $\xi$ such that $\xi(\omega) \in I$ for every $\omega \in \Omega$ and $f(X(\omega)) = f(a) + f'(\xi(\omega))(X(\omega) - a)$. The proof is not trivial. By applying these results in statistics, one may simplify some details in the proofs of the delta method or of the asymptotic properties of a maximum likelihood estimator. In particular, when one asserts that there exists $\theta^*$ between $\hat{\theta}$ (a maximum likelihood estimator) and $\theta_0$ (the true value), the stochastic version of the Mean Value Theorem guarantees that $\theta^*$ is a random variable (or a random vector).
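
For intuition, the hedged sketch below checks the univariate statement numerically: for each realization $X(\omega)$ it solves $f(X(\omega)) = f(a) + f'(\xi(\omega))(X(\omega) - a)$ for $\xi(\omega)$ inside the interval between $a$ and $X(\omega)$, so $\xi$ is a pointwise-defined function of the sample. The choice $f = \exp$ and the root-finding approach are illustrative assumptions; the theorem's real content is the measurability of $\xi$, which a numerical check cannot establish.

```python
import numpy as np
from scipy.optimize import brentq

rng = np.random.default_rng(0)
f, fprime = np.exp, np.exp          # illustrative choice: f = exp on I = [-3, 3]
a = 0.0
X = rng.uniform(-3.0, 3.0, size=5)  # a random variable, sampled 5 times

for x in X:
    # Solve f(x) = f(a) + f'(xi) * (x - a) for xi between a and x.
    g = lambda xi: f(a) + fprime(xi) * (x - a) - f(x)
    lo, hi = sorted((a, x))
    xi = brentq(g, lo, hi)          # strict convexity of exp gives a sign change
    assert lo <= xi <= hi
    print(f"x = {x:+.3f}  ->  xi(omega) = {xi:+.3f}")
```
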
In this article, we propose new Bayesian methods for selecting and estimating a sparse coefficient vector for a skewed heteroscedastic response. Our novel Bayesian procedures effectively estimate the median and other quantile functions, accommodate non-local priors for regression effects without compromising ease of implementation via sampling-based tools, and asymptotically select the true set of predictors even when the number of covariates grows at the same order as the sample size. We also extend our method to handle observations with very large errors. Via simulation studies and a re-analysis of a medical cost study with a large number of potential predictors, we illustrate the ease of implementation and other practical advantages of our approach compared to existing methods for such studies.
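
The authors' procedure is Bayesian, with non-local priors and a sampling-based implementation; as a minimal frequentist analogue of the estimation target (a sparse median function under skewed, heteroscedastic errors), the sketch below fits an $\ell_1$-penalized quantile regression with scikit-learn. The data-generating process and penalty level are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import QuantileRegressor

rng = np.random.default_rng(1)
n, p = 200, 50
X = rng.normal(size=(n, p))
beta = np.zeros(p); beta[:3] = [2.0, -1.5, 1.0]          # sparse truth
# Skewed, heteroscedastic errors: scale grows with |x_1|.
err = (1.0 + np.abs(X[:, 0])) * rng.gamma(shape=2.0, scale=0.5, size=n)
y = X @ beta + err

# l1-penalized median regression (pinball loss at quantile 0.5).
model = QuantileRegressor(quantile=0.5, alpha=0.05, solver="highs")
model.fit(X, y)
print("selected predictors:", np.flatnonzero(np.abs(model.coef_) > 1e-6))
```
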
This paper tackles the problem of detecting abrupt changes in the mean of a heteroscedastic signal by model selection, without knowledge of how the noise level varies. A new family of change-point detection procedures is proposed, showing that cross-validation methods can be successful in the heteroscedastic framework, whereas most existing procedures are not robust to heteroscedasticity. The robustness to heteroscedasticity of the proposed procedures is supported by an extensive simulation study, together with recent theoretical results. An application to Comparative Genomic Hybridization (CGH) data is provided, showing that robustness to heteroscedasticity can indeed be required for the analysis of such data.
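
As a toy illustration of cross-validation in this setting, the sketch below fits piecewise-constant means on one half of the observations (even indices) and scores candidate segmentations on the held-out half (odd indices), so the selection criterion does not presume a constant noise variance. The even/odd split and the use of the `ruptures` package for least-squares segmentation are illustrative choices, not the paper's exact procedure.

```python
import numpy as np
import ruptures as rpt   # pip install ruptures

rng = np.random.default_rng(2)
# Heteroscedastic signal: mean changes at 100 and 200, noise level varies.
mean = np.r_[np.zeros(100), 2.0 * np.ones(100), 0.5 * np.ones(100)]
sigma = np.r_[0.3 * np.ones(150), 1.2 * np.ones(150)]
y = mean + sigma * rng.normal(size=300)

train, test = y[0::2], y[1::2]            # even/odd split, same index grid
best_k, best_err = 0, np.inf
for k in range(0, 6):                     # candidate numbers of change points
    if k == 0:
        fit = np.full_like(test, train.mean())
    else:
        bkps = rpt.Dynp(model="l2").fit(train.reshape(-1, 1)).predict(n_bkps=k)
        fit = np.empty_like(test)         # piecewise-constant fit from train
        start = 0
        for end in bkps:
            fit[start:end] = train[start:end].mean()
            start = end
    err = np.mean((test - fit) ** 2)      # held-out quadratic loss
    if err < best_err:
        best_k, best_err = k, err
print("selected number of change points:", best_k)
```
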
Kun Zhou, Ker-Chau Li (2019)
The issue of honesty in constructing confidence sets arises in nonparametric regression. While the optimal rate in nonparametric estimation can be achieved and used to construct sharp confidence sets, the confidence level often degrades severely once the degree of smoothness is estimated. Similarly, in high-dimensional regression, oracle inequalities for sparse estimators can be used to construct sharp confidence sets, yet the degree of sparsity itself is unknown and must be estimated, causing the same honesty problem. To resolve this issue, we develop a novel method to construct honest confidence sets for sparse high-dimensional linear regression. The key idea in our construction is to separate signals into a strong and a weak group, and then construct confidence sets for each group separately. This is achieved by a projection and shrinkage approach, the latter implemented via Stein estimation and the associated Stein unbiased risk estimate. Our confidence set is honest over the full parameter space without any sparsity constraints, while its diameter adapts to the optimal rate of $n^{-1/4}$ when the true parameter is indeed sparse. Through extensive numerical comparisons, we demonstrate that our method outperforms competitors by wide margins in finite samples, including oracle methods built upon the true sparsity of the underlying model.
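
The shrinkage step mentioned above can be illustrated with a standard positive-part James-Stein estimate and its Stein unbiased risk estimate (SURE); the sketch below is a generic illustration of SURE-based shrinkage toward zero, not the paper's projection construction.

```python
import numpy as np

def js_shrink_with_sure(z, sigma2=1.0):
    """Positive-part James-Stein shrinkage of z ~ N(theta, sigma2 * I)
    toward zero, plus the SURE of the plain James-Stein rule."""
    d = z.size
    norm2 = np.sum(z ** 2)
    c = max(0.0, 1.0 - (d - 2) * sigma2 / norm2)        # shrinkage factor
    theta_hat = c * z
    sure = d * sigma2 - (d - 2) ** 2 * sigma2 ** 2 / norm2
    return theta_hat, sure

rng = np.random.default_rng(3)
theta = np.r_[np.zeros(95), 3.0 * np.ones(5)]           # sparse mean vector
z = theta + rng.normal(size=100)
theta_hat, sure = js_shrink_with_sure(z)
print("SURE:", sure, " loss on this draw:", np.sum((theta_hat - theta) ** 2))
```
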
Kun Liu, Ruizhi Zhang (2016)
In this article, motivated by biosurveillance and censoring sensor networks, we investigate the problem of distributed monitoring of large-scale data streams, where an undesired event may occur at some unknown time and affect only a few unknown data streams. We develop scalable global monitoring schemes by running local detection procedures in parallel and combining them, via SUM-shrinkage techniques, into a global decision. Our approach is illustrated in two concrete examples: one is the nonhomogeneous case, where the pre-change and post-change local distributions are given, and the other is the homogeneous case of monitoring a large number of independent $N(0,1)$ data streams where the means of some streams might shift to unknown positive or negative values. Numerical simulation studies demonstrate the usefulness of the proposed schemes.
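
A minimal sketch of the homogeneous example: each stream runs one-sided CUSUMs for upward and downward mean shifts in $N(0,1)$ data, the local statistics are hard-thresholded (the shrinkage step), and the global statistic sums the survivors; an alarm is raised when the sum crosses a threshold. The specific drift and threshold values are illustrative assumptions.

```python
import numpy as np

def monitor(streams, drift=0.5, local_thresh=3.0, global_thresh=10.0):
    """SUM-shrinkage-style global monitoring of independent N(0,1) streams.

    streams : T x K array, row t holds the K observations at time t.
    Returns the first alarm time, or None if no alarm is raised.
    """
    T, K = streams.shape
    w_pos = np.zeros(K)                  # one-sided CUSUMs for upward shifts
    w_neg = np.zeros(K)                  # ... and for downward shifts
    for t in range(T):
        x = streams[t]
        w_pos = np.maximum(0.0, w_pos + x - drift)
        w_neg = np.maximum(0.0, w_neg - x - drift)
        w = np.maximum(w_pos, w_neg)
        # Shrinkage: only streams whose local statistic clears a threshold
        # contribute to the global statistic.
        global_stat = np.sum(np.where(w > local_thresh, w, 0.0))
        if global_stat > global_thresh:
            return t
    return None

rng = np.random.default_rng(4)
K, T, change = 100, 400, 200
data = rng.normal(size=(T, K))
data[change:, :3] += 1.0                 # 3 of 100 streams shift upward at t = 200
print("alarm at time:", monitor(data))
```
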
