This paper studies the minimax rate of nonparametric conditional density estimation under a weighted absolute value loss function in a multivariate setting. We first demonstrate that conditional density estimation is impossible if one only requires that $p_{X|Z}$ is smooth in $x$ for all values of $z$. This motivates us to consider a sub-class of absolutely continuous distributions, restricting the conditional density $p_{X|Z}(x|z)$ to be not only Hölder smooth in $x$, but also total variation smooth in $z$. We propose a corresponding kernel-based estimator and prove that it achieves the minimax rate. We give some simple examples of densities satisfying our assumptions, which show that our results are not vacuous. Finally, we propose an estimator which achieves the minimax optimal rate adaptively, i.e., without the need to know the smoothness parameter values in advance. Crucially, both of our estimators (the adaptive and non-adaptive ones) impose no assumptions on the marginal density $p_Z$, and neither is obtained as a ratio of two kernel smoothing estimators, which might seem like the go-to approach to this problem.
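For contrast, the ratio-of-kernel-smoothers baseline mentioned above can be sketched in a few lines. This is not the estimator proposed in the paper; it is the naive approach the abstract sets aside, shown here in one dimension with Gaussian kernels. The function names and the bandwidths `hx` and `hz` are illustrative assumptions.

```python
import numpy as np

def gauss(u):
    """Standard normal kernel."""
    return np.exp(-0.5 * u ** 2) / np.sqrt(2 * np.pi)

def ratio_conditional_density(x0, z0, x, z, hx, hz):
    """Naive estimate of p_{X|Z}(x0 | z0) as joint KDE / marginal KDE of Z.

    Hypothetical baseline for illustration only, not the paper's estimator.
    """
    wz = gauss((z0 - z) / hz)        # kernel weights in the Z direction
    denom = wz.sum()                 # proportional to the KDE of p_Z at z0
    if denom == 0.0:
        return float("nan")          # no observations near z0
    # The 1/(n * hz) factors cancel between numerator and denominator.
    return float(np.sum(wz * gauss((x0 - x) / hx)) / (hx * denom))
```

The denominator is a kernel estimate of $p_Z(z_0)$, so the ratio becomes unstable wherever few observations of $Z$ fall near $z_0$. This is one reason an estimator that needs no assumptions on $p_Z$ is of interest.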
We consider the problem of conditional independence testing of $X$ and $Y$ given $Z$, where $X$, $Y$ and $Z$ are three real random variables and $Z$ is continuous. We focus on two main cases: when $X$ and $Y$ are both discrete, and when $X$ and $Y$ are both continuous. In view of recent results on conditional independence testing (Shah and Peters, 2018), one cannot hope to design non-trivial tests which control the type I error for all absolutely continuous conditionally independent distributions while still ensuring power against interesting alternatives. Consequently, we identify several natural smoothness assumptions on the conditional distributions of $X,Y|Z=z$ as $z$ varies in the support of $Z$, and study the hardness of conditional independence testing under these smoothness assumptions. We derive matching lower and upper bounds on the critical radius of separation between the null and alternative hypotheses in the total variation metric. The tests we consider are easily implementable and rely on binning the support of the continuous variable $Z$. To complement these results, we provide a new proof of the hardness result of Shah and Peters.
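The binning idea can be illustrated with a minimal sketch for discrete $X$ and $Y$. The statistic below (a plug-in squared distance between the joint and product pmf in each bin) and the within-bin permutation calibration are simplifying assumptions chosen for illustration; they are not the test statistics analysed in the paper, and all function names are hypothetical.

```python
import numpy as np

def binned_ci_statistic(x, y, z, n_bins):
    """Sum over Z-bins of n_b * ||p_XY - p_X p_Y||_2^2 (plug-in estimate)."""
    edges = np.linspace(z.min(), z.max(), n_bins + 1)
    bins = np.digitize(z, edges[1:-1])   # bin labels in 0 .. n_bins - 1
    xs, ys = np.unique(x), np.unique(y)
    stat = 0.0
    for b in range(n_bins):
        mask = bins == b
        nb = mask.sum()
        if nb < 2:
            continue                     # too few points to compare
        # Empirical joint pmf of (X, Y) within this bin.
        joint = np.array([[np.mean((x[mask] == a) & (y[mask] == c))
                           for c in ys] for a in xs])
        prod = np.outer(joint.sum(axis=1), joint.sum(axis=0))
        stat += nb * np.sum((joint - prod) ** 2)
    return stat, bins

def local_permutation_pvalue(x, y, z, n_bins=5, n_perm=200, seed=0):
    """Calibrate the statistic by permuting Y separately inside each bin."""
    rng = np.random.default_rng(seed)
    observed, bins = binned_ci_statistic(x, y, z, n_bins)
    exceed = 0
    for _ in range(n_perm):
        y_perm = y.copy()
        for b in range(n_bins):
            idx = np.flatnonzero(bins == b)
            y_perm[idx] = rng.permutation(y[idx])
        stat, _ = binned_ci_statistic(x, y_perm, z, n_bins)
        exceed += int(stat >= observed)
    return (1 + exceed) / (1 + n_perm)
```

Permuting within bins is only approximately valid: it treats $Z$ as constant inside each bin, which is where smoothness of the conditional distributions in $z$ enters.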
This paper presents minimax rates for density estimation when the data dimension $d$ is allowed to grow with the number of observations $n$ rather than remaining fixed as in previous analyses. We prove a non-asymptotic lower bound which gives the worst-case rate over standard classes of smooth densities, and we show that kernel density estimators achieve this rate. We also give oracle choices for the bandwidth and derive the fastest rate at which $d$ can grow with $n$ while maintaining estimation consistency.
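As a rough illustration of the objects involved, a $d$-dimensional kernel density estimator with a single bandwidth can be sketched as follows. The bandwidth rule shown is the classical fixed-dimension choice $h \asymp n^{-1/(2\beta+d)}$ for smoothness $\beta$, used here as an assumption; it is not the oracle choice derived in the paper, and the Gaussian kernel (a second-order kernel) and function names are illustrative.

```python
import numpy as np

def gaussian_kde(query, sample, h):
    """Product-Gaussian kernel density estimate at each row of `query`."""
    query = np.atleast_2d(query)
    n, d = sample.shape
    # Scaled pairwise differences, shape (m, n, d).
    u = (query[:, None, :] - sample[None, :, :]) / h
    # Product of d standard normal kernels evaluated at u.
    kern = np.exp(-0.5 * np.sum(u ** 2, axis=2)) / (2 * np.pi) ** (d / 2)
    return kern.mean(axis=1) / h ** d

def classical_bandwidth(n, d, beta):
    """Fixed-d textbook rate h = n^(-1/(2*beta + d)); an assumed stand-in."""
    return n ** (-1.0 / (2 * beta + d))
```

Under this classical rule the bandwidth $n^{-1/(2\beta+d)}$ shrinks more slowly as $d$ increases, which gives some intuition for why the growth of $d$ with $n$ has to be limited.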
We study minimax density estimation on the product space $\mathbb{R}^{d_1}\times\mathbb{R}^{d_2}$. We consider $L^p$-risk for probability density functions defined over regularity spaces that allow for different levels of smoothness in each of the variables. Precisely, we study probabilities on Sobolev spaces with dominating mixed smoothness. We provide the rate of convergence that is optimal even for the classical Sobolev spaces.
We address the problem of adaptive minimax density estimation on $\mathbb{R}^d$ with $\mathbb{L}_p$-loss on the anisotropic Nikolskii classes. We fully characterize the behavior of the minimax risk for different relationships between the regularity parameters and the norm indexes in the definitions of the functional class and of the risk. In particular, we show that there are four different regimes with respect to the behavior of the minimax risk. We develop a single estimator which is (nearly) optimal in order over the complete scale of the anisotropic Nikolskii classes. Our estimation procedure is based on a data-driven selection of an estimator from a fixed family of kernel estimators.
In this paper we consider the problem of estimating $f$, the conditional density of $Y$ given $X$, by using an independent sample distributed as $(X,Y)$ in the multivariate setting. We consider the estimation of $f(x,\cdot)$ where $x$ is a fixed point. We define two different estimation procedures, the first one using kernel rules, the second one inspired by projection methods. Both adaptive estimators are tuned by using the Goldenshluger and Lepski methodology. After deriving lower bounds, we show that these procedures satisfy oracle inequalities and are optimal from the minimax point of view on anisotropic Hölder balls. Furthermore, our results allow us to measure precisely the influence of $\mathrm{f}_X(x)$ on rates of convergence, where $\mathrm{f}_X$ is the density of $X$. Finally, some simulations illustrate the good behavior of our tuned estimates in practice.