مساحة جديدة

اشترك بالحزمة الذهبية واحصل على وصول غير محدود شمرا أكاديميا

تسجيل مستخدم جديد

Optimal cross-validation in density estimation with the $L^2$-loss

153 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Alain Celisse

تاريخ النشر 2014

مجال البحث الاحصاء الرياضي

والبحث باللغة English

تأليف Alain Celisse

نظرية الإحصاء نظرية الإحصاء

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

We analyze the performance of cross-validation (CV) in the density estimation framework with two purposes: (i) risk estimation and (ii) model selection. The main focus is given to the so-called leave-$p$-out CV procedure (Lpo), where $p$ denotes the cardinality of the test set. Closed-form expressions are settled for the Lpo estimator of the risk of projection estimators. These expressions provide a great improvement upon $V$-fold cross-validation in terms of variability and computational complexity. From a theoretical point of view, closed-form expressions also enable to study the Lpo performance in terms of risk estimation. The optimality of leave-one-out (Loo), that is Lpo with $p=1$, is proved among CV procedures used for risk estimation. Two model selection frameworks are also considered: estimation, as opposed to identification. For estimation with finite sample size $n$, optimality is achieved for $p$ large enough [with $p/n=o(1)$] to balance the overfitting resulting from the structure of the model collection. For identification, model selection consistency is settled for Lpo as long as $p/n$ is conveniently related to the rate of convergence of the best estimator in the collection: (i) $p/nto1$ as $nto+infty$ with a parametric rate, and (ii) $p/n=o(1)$ with some nonparametric estimators. These theoretical results are validated by simulation experiments.

قيم البحث

74 - Guillaume Maillard 2021

In model selection, several types of cross-validation are commonly used and many variants have been introduced. While consistency of some of these methods has been proven, their rate of convergence to the oracle is generally still unknown. Until now, an asymptotic analysis of crossvalidation able to answer this question has been lacking. Existing results focus on the pointwise estimation of the risk of a single estimator, whereas analysing model selection requires understanding how the CV risk varies with the model. In this article, we investigate the asymptotics of the CV risk in the neighbourhood of the optimal model, for trigonometric series estimators in density estimation. Asymptotically, simple validation and incomplete V --fold CV behave like the sum of a convex function fn and a symmetrized Brownian changed in time W gn/V. We argue that this is the right asymptotic framework for studying model selection.

نظرية الإحصاء نظرية الإحصاء

Predictive density estimation under the Wasserstein loss

62 - Takeru Matsuda , William E. Strawderman 2019

We investigate predictive density estimation under the $L^2$ Wasserstein loss for location families and location-scale families. We show that plug-in densities form a complete class and that the Bayesian predictive density is given by the plug-in den sity with the posterior mean of the location and scale parameters. We provide Bayesian predictive densities that dominate the best equivariant one in normal models.

نظرية الإحصاء نظرية الإحصاء

Minimax Optimal Conditional Density Estimation under Total Variation Smoothness

98 - Michael Li , Matey Neykov , Sivaraman Balakrishnan 2021

This paper studies the minimax rate of nonparametric conditional density estimation under a weighted absolute value loss function in a multivariate setting. We first demonstrate that conditional density estimation is impossible if one only requires t hat $p_{X|Z}$ is smooth in $x$ for all values of $z$. This motivates us to consider a sub-class of absolutely continuous distributions, restricting the conditional density $p_{X|Z}(x|z)$ to not only be Holder smooth in $x$, but also be total variation smooth in $z$. We propose a corresponding kernel-based estimator and prove that it achieves the minimax rate. We give some simple examples of densities satisfying our assumptions which imply that our results are not vacuous. Finally, we propose an estimator which achieves the minimax optimal rate adaptively, i.e., without the need to know the smoothness parameter values in advance. Crucially, both of our estimators (the adaptive and non-adaptive ones) impose no assumptions on the marginal density $p_Z$, and are not obtained as a ratio between two kernel smoothing estimators which may sound like a go to approach in this problem.

نظرية الإحصاء نظرية الإحصاء

Shrinkage estimation with a matrix loss function

185 - Reman Abu-Shanab , John T. Kent , William E. Strawderman 2011

Consider estimating the n by p matrix of means of an n by p matrix of independent normally distributed observations with constant variance, where the performance of an estimator is judged using a p by p matrix quadratic error loss function. A matrix version of the James-Stein estimator is proposed, depending on a tuning constant. It is shown to dominate the usual maximum likelihood estimator for some choices of of the tuning constant when n is greater than or equal to 3. This result also extends to other shrinkage estimators and settings.

نظرية الإحصاء نظرية الإحصاء

Optimal convergence rates for the invariant density estimation of jump-diffusion processes

207 - Chiara Amorino , Eulalia Nualart 2021

We aim at estimating the invariant density associated to a stochastic differential equation with jumps in low dimension, which is for $d=1$ and $d=2$. We consider a class of jump diffusion processes whose invariant density belongs to some Holder spac e. Firstly, in dimension one, we show that the kernel density estimator achieves the convergence rate $frac{1}{T}$, which is the optimal rate in the absence of jumps. This improves the convergence rate obtained in [Amorino, Gloter (2021)], which depends on the Blumenthal-Getoor index for $d=1$ and is equal to $frac{log T}{T}$ for $d=2$. Secondly, we show that is not possible to find an estimator with faster rates of estimation. Indeed, we get some lower bounds with the same rates ${frac{1}{T},frac{log T}{T}}$ in the mono and bi-dimensional cases, respectively. Finally, we obtain the asymptotic normality of the estimator in the one-dimensional case.

نظرية الإحصاء نظرية الإحصاء

سجل دخول لتتمكن من نشر تعليقات

التعليقات

جاري جلب التعليقات

سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها

جامعة طرطوس

تفاصيل إضافية المزيد من الجامعات

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

Optimal cross-validation in density estimation with the $L^2$-loss

اسأل ChatGPT حول البحث

ﻻ يوجد ملخص باللغة العربية

اقرأ أيضاً