بحث متقدم مدعوم من الذكاء الصنعي

مساحة جديدة

اشترك بالحزمة الذهبية واحصل على وصول غير محدود شمرا أكاديميا

تسجيل مستخدم جديد

A Computationally Efficient Classification Algorithm in Posterior Drift Model: Phase Transition and Minimax Adaptivity

289 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Ruiqi Liu

تاريخ النشر 2020

مجال البحث الاحصاء الرياضي

والبحث باللغة English

تأليف Ruiqi Liu - Kexuan Li - Zuofeng Shang

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

In massive data analysis, training and testing data often come from very different sources, and their probability distributions are not necessarily identical. A feature example is nonparametric classification in posterior drift model where the conditional distributions of the label given the covariates are possibly different. In this paper, we derive minimax rate of the excess risk for nonparametric classification in posterior drift model in the setting that both training and testing data have smooth distributions, extending a recent work by Cai and Wei (2019) who only impose smoothness condition on the distribution of testing data. The minimax rate demonstrates a phase transition characterized by the mutual relationship between the smoothness orders of the training and testing data distributions. We also propose a computationally efficient and data-driven nearest neighbor classifier which achieves the minimax excess risk (up to a logarithm factor). Simulation studies and a real-world application are conducted to demonstrate our approach.

قيم البحث

90 - Kyoungjae Lee , Seongil Jo , Jaeyong Lee 2021

Statistical inference for sparse covariance matrices is crucial to reveal dependence structure of large multivariate data sets, but lacks scalable and theoretically supported Bayesian methods. In this paper, we propose beta-mixture shrinkage prior, c omputationally more efficient than the spike and slab prior, for sparse covariance matrices and establish its minimax optimality in high-dimensional settings. The proposed prior consists of beta-mixture shrinkage and gamma priors for off-diagonal and diagonal entries, respectively. To ensure positive definiteness of the resulting covariance matrix, we further restrict the support of the prior to a subspace of positive definite matrices. We obtain the posterior convergence rate of the induced posterior under the Frobenius norm and establish a minimax lower bound for sparse covariance matrices. The class of sparse covariance matrices for the minimax lower bound considered in this paper is controlled by the number of nonzero off-diagonal elements and has more intuitive appeal than those appeared in the literature. The obtained posterior convergence rate coincides with the minimax lower bound unless the true covariance matrix is extremely sparse. In the simulation study, we show that the proposed method is computationally more efficient than competitors, while achieving comparable performance. Advantages of the shrinkage prior are demonstrated based on two real data sets.

نظرية الإحصاء المنهجية نظرية الإحصاء

Gibbs posterior concentration rates under sub-exponential type losses

86 - Nicholas Syring , Ryan Martin 2020

Bayesian posterior distributions are widely used for inference, but their dependence on a statistical model creates some challenges. In particular, there may be lots of nuisance parameters that require prior distributions and posterior computations, plus a potentially serious risk of model misspecification bias. Gibbs posterior distributions, on the other hand, offer direct, principled, probabilistic inference on quantities of interest through a loss function, not a model-based likelihood. Here we provide simple sufficient conditions for establishing Gibbs posterior concentration rates when the loss function is of a sub-exponential type. We apply these general results in a range of practically relevant examples, including mean regression, quantile regression, and sparse high-dimensional classification. We also apply these techniques in an important problem in medical statistics, namely, estimation of a personalized minimum clinically important difference.

نظرية الإحصاء المنهجية نظرية الإحصاء

Minimax Estimation of Large Precision Matrices with Bandable Cholesky Factor

89 - Yu Liu , Zhao Ren 2017

Last decade witnesses significant methodological and theoretical advances in estimating large precision matrices. In particular, there are scientific applications such as longitudinal data, meteorology and spectroscopy in which the ordering of the va riables can be interpreted through a bandable structure on the Cholesky factor of the precision matrix. However, the minimax theory has still been largely unknown, as opposed to the well established minimax results over the corresponding bandable covariance matrices. In this paper, we focus on two commonly used types of parameter spaces, and develop the optimal rates of convergence under both the operator norm and the Frobenius norm. A striking phenomenon is found: two types of parameter spaces are fundamentally different under the operator norm but enjoy the same rate optimality under the Frobenius norm, which is in sharp contrast to the equivalence of corresponding two types of bandable covariance matrices under both norms. This fundamental difference is established by carefully constructing the corresponding minimax lower bounds. Two new estimation procedures are developed: for the operator norm, our optimal procedure is based on a novel local cropping estimator targeting on all principle submatrices of the precision matrix while for the Frobenius norm, our optimal procedure relies on a delicate regression-based thresholding rule. Lepskis method is considered to achieve optimal adaptation. We further establish rate optimality in the nonparanormal model. Numerical studies are carried out to confirm our theoretical findings.

نظرية الإحصاء المنهجية نظرية الإحصاء

A provable two-stage algorithm for penalized hazards regression

98 - Jianqing Fan , Wenyan Gong , Qiang Sun 2021

From an optimizers perspective, achieving the global optimum for a general nonconvex problem is often provably NP-hard using the classical worst-case analysis. In the case of Coxs proportional hazards model, by taking its statistical model structures into account, we identify local strong convexity near the global optimum, motivated by which we propose to use two convex programs to optimize the folded-concave penalized Coxs proportional hazards regression. Theoretically, we investigate the statistical and computational tradeoffs of the proposed algorithm and establish the strong oracle property of the resulting estimators. Numerical studies and real data analysis lend further support to our algorithm and theory.

نظرية الإحصاء المنهجية نظرية الإحصاء

On loss functions and regret bounds for multi-category classification

105 - Zhiqiang Tan , Xinwei Zhang 2020

We develop new approaches in multi-class settings for constructing proper scoring rules and hinge-like losses and establishing corresponding regret bounds with respect to the zero-one or cost-weighted classification loss. Our construction of losses i nvolves deriving new inverse mappings from a concave generalized entropy to a loss through the use of a convex dissimilarity function related to the multi-distribution $f$-divergence. Moreover, we identify new classes of multi-class proper scoring rules, which also recover and reveal interesting relationships between various composite losses currently in use. We establish new classification regret bounds in general for multi-class proper scoring rules by exploiting the Bregman divergences of the associated generalized entropies, and, as applications, provide simple meaningful regret bounds for two specific classes of proper scoring rules. Finally, we derive new hinge-like convex losses, which are tighter convex extensions than related hinge-like losses and geometrically simpler with fewer non-differentiable edges, while achieving similar regret bounds. We also establish a general classification regret bound for all losses which induce the same generalized entropy as the zero-one loss.

نظرية الإحصاء المنهجية نظرية الإحصاء

سجل دخول لتتمكن من نشر تعليقات

التعليقات

جاري جلب التعليقات

سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها

جامعة إيبلا الخاصة

تفاصيل إضافية المزيد من الجامعات

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

A Computationally Efficient Classification Algorithm in Posterior Drift Model: Phase Transition and Minimax Adaptivity

اسأل ChatGPT حول البحث

ﻻ يوجد ملخص باللغة العربية

اقرأ أيضاً