Let $X$ be a centered Gaussian random variable in a separable Hilbert space ${\mathbb H}$ with covariance operator $\Sigma.$ We study the problem of estimating a smooth functional of $\Sigma$ based on a sample $X_1,\dots ,X_n$ of $n$ independent observations of $X.$ More specifically, we are interested in functionals of the form $\langle f(\Sigma), B\rangle,$ where $f:{\mathbb R}\mapsto {\mathbb R}$ is a smooth function and $B$ is a nuclear operator in ${\mathbb H}.$ We prove concentration and normal approximation bounds for the plug-in estimator $\langle f(\hat \Sigma),B\rangle,$ $\hat \Sigma:=n^{-1}\sum_{j=1}^n X_j\otimes X_j$ being the sample covariance based on $X_1,\dots, X_n.$ These bounds show that $\langle f(\hat \Sigma),B\rangle$ is an asymptotically normal estimator of its expectation ${\mathbb E}_{\Sigma} \langle f(\hat \Sigma),B\rangle$ (rather than of the parameter of interest $\langle f(\Sigma),B\rangle$) with a parametric convergence rate $O(n^{-1/2})$ provided that the effective rank ${\bf r}(\Sigma):= \frac{{\rm tr}(\Sigma)}{\|\Sigma\|}$ (${\rm tr}(\Sigma)$ being the trace and $\|\Sigma\|$ being the operator norm of $\Sigma$) satisfies the assumption ${\bf r}(\Sigma)=o(n).$ At the same time, we show that the bias of this estimator is typically as large as $\frac{{\bf r}(\Sigma)}{n}$ (which is larger than $n^{-1/2}$ if ${\bf r}(\Sigma)\geq n^{1/2}$). In the case when ${\mathbb H}$ is a finite-dimensional space of dimension $d=o(n),$ we develop a method of bias reduction and construct an estimator $\langle h(\hat \Sigma),B\rangle$ of $\langle f(\Sigma),B\rangle$ that is asymptotically normal with convergence rate $O(n^{-1/2}).$ Moreover, we study asymptotic properties of the risk of this estimator and prove minimax lower bounds for arbitrary estimators showing the asymptotic efficiency of $\langle h(\hat \Sigma),B\rangle$ in a semi-parametric sense.
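The quantities named in this abstract can be illustrated numerically in finite dimension. The following is a minimal sketch, not the authors' method: it computes the sample covariance, the effective rank, and the plain plug-in estimator (without the bias reduction step, which the abstract does not specify). The function names, the choice $f=\exp$, the rank-one $B$ and the spectrum of $\Sigma$ are all assumptions made for illustration.

```python
import numpy as np


def sample_covariance(X):
    """Sample covariance n^{-1} sum_j X_j (x) X_j for centered data (rows of X)."""
    n = X.shape[0]
    return X.T @ X / n


def effective_rank(Sigma):
    """r(Sigma) = tr(Sigma) / ||Sigma||, with ||.|| the operator (spectral) norm."""
    return np.trace(Sigma) / np.linalg.norm(Sigma, ord=2)


def apply_spectral(f, Sigma):
    """f(Sigma) for a symmetric matrix, computed through its eigendecomposition."""
    w, V = np.linalg.eigh(Sigma)
    # V * f(w) scales column j of V by f(w_j), so this is V diag(f(w)) V^T.
    return (V * f(w)) @ V.T


def plug_in(f, Sigma_hat, B):
    """<f(Sigma_hat), B> = tr(f(Sigma_hat) B^T), the Hilbert-Schmidt inner product."""
    return np.sum(apply_spectral(f, Sigma_hat) * B)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, n = 20, 2000                      # illustrative sizes (assumption)
    eigvals = 1.0 / np.arange(1, d + 1)  # hypothetical decaying spectrum
    Sigma = np.diag(eigvals)

    # Centered Gaussian sample with covariance Sigma.
    X = rng.standard_normal((n, d)) * np.sqrt(eigvals)
    Sigma_hat = sample_covariance(X)

    e1 = np.zeros(d)
    e1[0] = 1.0
    B = np.outer(e1, e1)                 # rank-one, hence nuclear

    print("effective rank r(Sigma):", effective_rank(Sigma))
    print("plug-in <f(Sigma_hat), B>:", plug_in(np.exp, Sigma_hat, B))
    print("target  <f(Sigma), B>:", plug_in(np.exp, Sigma, B))
```

Repeating the simulation over many samples and averaging the plug-in value would give a rough picture of the bias the abstract describes, though this sketch makes no claim about its size.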
We study the problem of estimating smooth functionals of the parameter $\theta$ of the Gaussian shift model $$ X=\theta +\xi,\ \theta \in E, $$ where $E$ is a separable Banach space and $X$ is an observation of the unknown vector $\theta$ in Gaussian noise $\xi$ with zero mean and known covariance operator $\Sigma.$ In particular, we develop estimators $T(X)$ of $f(\theta)$ for functionals $f:E\mapsto {\mathbb R}$ of Hölder smoothness $s>0$ such that $$ \sup_{\|\theta\|\leq 1} {\mathbb E}_{\theta}(T(X)-f(\theta))^2 \lesssim \Bigl(\|\Sigma\| \vee ({\mathbb E}\|\xi\|^2)^s\Bigr)\wedge 1, $$ where $\|\Sigma\|$ is the operator norm of $\Sigma,$ and show that this mean squared error rate is minimax optimal at least in the case of the standard Gaussian shift model ($E={\mathbb R}^d$ equipped with the canonical Euclidean norm, $\xi =\sigma Z,$ $Z\sim {\mathcal N}(0;I_d)$). Moreover, we determine a sharp threshold on the smoothness $s$ of the functional $f$ such that, for all $s$ above the threshold, $f(\theta)$ can be estimated efficiently with a mean squared error rate of the order $\|\Sigma\|$ in a small noise setting (that is, when ${\mathbb E}\|\xi\|^2$ is small). The construction of efficient estimators is crucially based on a bootstrap chain method of bias reduction. The results could be applied to a variety of special high-dimensional and infinite-dimensional Gaussian models (for vector, matrix and functional data).
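As a worked instance of the bound above, consider the standard Gaussian shift model mentioned in the abstract. Using only the definitions given there, $\xi=\sigma Z$ with $Z\sim {\mathcal N}(0;I_d)$ has covariance $\Sigma=\sigma^2 I_d,$ so that $$ \|\Sigma\|=\sigma^2, \qquad {\mathbb E}\|\xi\|^2=\sigma^2\, {\mathbb E}\|Z\|^2=\sigma^2 d, $$ and the right-hand side of the bound reads $$ \Bigl(\sigma^2 \vee (\sigma^2 d)^s\Bigr)\wedge 1. $$ In this special case the term $\|\Sigma\|=\sigma^2$ dominates exactly when $(\sigma^2 d)^s\leq \sigma^2.$ This is a direct substitution, not a restatement of the paper's threshold result.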
We study principal component analysis (PCA) for mean zero i.i.d. Gaussian observations $X_1,\dots, X_n$ in a separable Hilbert space $\mathbb{H}$ with unknown covariance operator $\Sigma.$ The complexity of the problem is characterized by its effective rank ${\bf r}(\Sigma):= \frac{{\rm tr}(\Sigma)}{\|\Sigma\|},$ where ${\rm tr}(\Sigma)$ denotes the trace of $\Sigma$ and $\|\Sigma\|$ denotes its operator norm. We develop a method of bias reduction in the problem of estimation of linear functionals of eigenvectors of $\Sigma.$ Under the assumption that ${\bf r}(\Sigma)=o(n),$ we establish the asymptotic normality and asymptotic properties of the risk of the resulting estimators and prove matching minimax lower bounds, showing their semi-parametric optimality.
Let $X_1,\dots, X_n$ be i.i.d. random variables sampled from a normal distribution $N(\mu,\Sigma)$ in ${\mathbb R}^d$ with unknown parameter $\theta=(\mu,\Sigma)\in \Theta:={\mathbb R}^d\times {\mathcal C}_+^d,$ where ${\mathcal C}_+^d$ is the cone of positive definite covariance operators in ${\mathbb R}^d.$ Given a smooth functional $f:\Theta \mapsto {\mathbb R}^1,$ the goal is to estimate $f(\theta)$ based on $X_1,\dots, X_n.$ Let $$ \Theta(a;d):={\mathbb R}^d\times \Bigl\{\Sigma\in {\mathcal C}_+^d: \sigma(\Sigma)\subset [1/a, a]\Bigr\},\ a\geq 1, $$ where $\sigma(\Sigma)$ is the spectrum of the covariance $\Sigma.$ Let $\hat \theta:=(\hat \mu, \hat \Sigma),$ where $\hat \mu$ is the sample mean and $\hat \Sigma$ is the sample covariance, based on the observations $X_1,\dots, X_n.$ For an arbitrary functional $f\in C^s(\Theta),$ $s=k+1+\rho,$ $k\geq 0,$ $\rho\in (0,1],$ we define a functional $f_k:\Theta \mapsto {\mathbb R}$ such that \begin{align*} & \sup_{\theta\in \Theta(a;d)}\|f_k(\hat \theta)-f(\theta)\|_{L_2({\mathbb P}_{\theta})} \lesssim_{s, \beta} \|f\|_{C^{s}(\Theta)} \biggl[\biggl(\frac{a}{\sqrt{n}} \bigvee a^{\beta s}\biggl(\sqrt{\frac{d}{n}}\biggr)^{s} \biggr)\wedge 1\biggr], \end{align*} where $\beta =1$ for $k=0$ and $\beta>s-1$ is arbitrary for $k\geq 1.$ This error rate is minimax optimal and similar bounds hold for more general loss functions. If $d=d_n\leq n^{\alpha}$ for some $\alpha\in (0,1)$ and $s\geq \frac{1}{1-\alpha},$ the rate becomes $O(n^{-1/2}).$ Moreover, for $s>\frac{1}{1-\alpha},$ the estimator $f_k(\hat \theta)$ is shown to be asymptotically efficient. The crucial part of the construction of the estimator $f_k(\hat \theta)$ is a bias reduction method studied in the paper for more general statistical models than normal.
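The smoothness threshold $s\geq \frac{1}{1-\alpha}$ stated above can be checked by a short calculation, sketched here under the assumption that $a$ is held fixed as $n$ grows. If $d\leq n^{\alpha},$ then $$ \biggl(\sqrt{\frac{d}{n}}\biggr)^{s} \leq \bigl(n^{-(1-\alpha)}\bigr)^{s/2} = n^{-(1-\alpha)s/2}, $$ and $$ n^{-(1-\alpha)s/2}\leq n^{-1/2} \iff (1-\alpha)s\geq 1 \iff s\geq \frac{1}{1-\alpha}. $$ Under this condition both terms inside the maximum in the bound, $\frac{a}{\sqrt{n}}$ and $a^{\beta s}\bigl(\sqrt{d/n}\bigr)^{s},$ are of order $n^{-1/2}$ for fixed $a,$ which gives the $O(n^{-1/2})$ rate.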
Let $X^{(n)}$ be an observation sampled from a distribution $P_{\theta}^{(n)}$ with an unknown parameter $\theta,$ $\theta$ being a vector in a Banach space $E$ (most often, a high-dimensional space of dimension $d$). We study the problem of estimation of $f(\theta)$ for a functional $f:E\mapsto {\mathbb R}$ of some smoothness $s>0$ based on an observation $X^{(n)}\sim P_{\theta}^{(n)}.$ Assuming that there exists an estimator $\hat \theta_n=\hat \theta_n(X^{(n)})$ of parameter $\theta$ such that $\sqrt{n}(\hat \theta_n-\theta)$ is sufficiently close in distribution to a mean zero Gaussian random vector in $E,$ we construct a functional $g:E\mapsto {\mathbb R}$ such that $g(\hat \theta_n)$ is an asymptotically normal estimator of $f(\theta)$ with $\sqrt{n}$ rate provided that $s>\frac{1}{1-\alpha}$ and $d\leq n^{\alpha}$ for some $\alpha\in (0,1).$ We also derive general upper bounds on Orlicz norm error rates for estimator $g(\hat \theta)$ depending on smoothness $s,$ dimension $d,$ sample size $n$ and the accuracy of normal approximation of $\sqrt{n}(\hat \theta_n-\theta).$ In particular, this approach yields asymptotically efficient estimators in some high-dimensional exponential models.
For testing hypotheses on the covariance operator of functional time series, we suggest using the full functional information and avoiding dimension reduction techniques. The limit distribution follows from a central limit theorem for the weak convergence of the partial sum process in a general Hilbert space, applied to the product space. In order to obtain critical values for the tests, we generalize bootstrap results from the independent to the dependent case. These results can be applied to covariance operators, autocovariance operators and cross-covariance operators. We discuss one-sample and changepoint tests and give some simulation results.