No Arabic abstract
This paper presents minimax rates for density estimation when the data dimension $d$ is allowed to grow with the number of observations $n$ rather than remaining fixed as in previous analyses. We prove a non-asymptotic lower bound which gives the worst-case rate over standard classes of smooth densities, and we show that kernel density estimators achieve this rate. We also give oracle choices for the bandwidth and derive the fastest rate $d$ can grow with $n$ to maintain estimation consistency.
We study asymptotic minimax problems for estimating a $d$-dimensional regression parameter over spheres of growing dimension ($dto infty$). Assuming that the data follows a linear model with Gaussian predictors and errors, we show that ridge regression is asymptotically minimax and derive new closed form expressions for its asymptotic risk under squared-error loss. The asymptotic risk of ridge regression is closely related to the Stieltjes transform of the Marv{c}enko-Pastur distribution and the spectral distribution of the predictors from the linear model. Adaptive ridge estimators are also proposed (which adapt to the unknown radius of the sphere) and connections with equivariant estimation are highlighted. Our results are mostly relevant for asymptotic settings where the number of observations, $n$, is proportional to the number of predictors, that is, $d/ntorhoin(0,infty)$.
We address the problem of adaptive minimax density estimation on $bR^d$ with $bL_p$--loss on the anisotropic Nikolskii classes. We fully characterize behavior of the minimax risk for different relationships between regularity parameters and norm indexes in definitions of the functional class and of the risk. In particular, we show that there are four different regimes with respect to the behavior of the minimax risk. We develop a single estimator which is (nearly) optimal in orderover the complete scale of the anisotropic Nikolskii classes. Our estimation procedure is based on a data-driven selection of an estimator from a fixed family of kernel estimators.
We consider the problem of estimating the predictive density of future observations from a non-parametric regression model. The density estimators are evaluated under Kullback--Leibler divergence and our focus is on establishing the exact asymptotics of minimax risk in the case of Gaussian errors. We derive the convergence rate and constant for minimax risk among Bayesian predictive densities under Gaussian priors and we show that this minimax risk is asymptotically equivalent to that among all density estimators.
This paper studies the minimax rate of nonparametric conditional density estimation under a weighted absolute value loss function in a multivariate setting. We first demonstrate that conditional density estimation is impossible if one only requires that $p_{X|Z}$ is smooth in $x$ for all values of $z$. This motivates us to consider a sub-class of absolutely continuous distributions, restricting the conditional density $p_{X|Z}(x|z)$ to not only be Holder smooth in $x$, but also be total variation smooth in $z$. We propose a corresponding kernel-based estimator and prove that it achieves the minimax rate. We give some simple examples of densities satisfying our assumptions which imply that our results are not vacuous. Finally, we propose an estimator which achieves the minimax optimal rate adaptively, i.e., without the need to know the smoothness parameter values in advance. Crucially, both of our estimators (the adaptive and non-adaptive ones) impose no assumptions on the marginal density $p_Z$, and are not obtained as a ratio between two kernel smoothing estimators which may sound like a go to approach in this problem.
We present an elementary mathematical method to find the minimax estimator of the Bernoulli proportion $theta$ under the squared error loss when $theta$ belongs to the restricted parameter space of the form $Omega = [0, eta]$ for some pre-specified constant $0 leq eta leq 1$. This problem is inspired from the problem of estimating the rate of positive COVID-19 tests. The presented results and applications would be useful materials for both instructors and students when teaching point estimation in statistical or machine learning courses.