Extrapolating the profile of a finite population

Published by: Yury Polyanskiy

Publication date: 2020

Research field: Informatics Engineering

Language: English

Authors: Soham Jana, Yury Polyanskiy, Yihong Wu

We study a prototypical problem in empirical Bayes. Namely, consider a population consisting of $k$ individuals each belonging to one of $k$ types (some types can be empty). Without any structural restrictions, it is impossible to learn the composition of the full population having observed only a small (random) subsample of size $m = o(k)$. Nevertheless, we show that in the sublinear regime of $m = \omega(k/\log k)$, it is possible to consistently estimate in total variation the \emph{profile} of the population, defined as the empirical distribution of the sizes of each type, which determines many symmetric properties of the population. We also prove that in the linear regime of $m = ck$ for any constant $c$ the optimal rate is $\Theta(1/\log k)$. Our estimator is based on Wolfowitz's minimum distance method, which entails solving a linear program (LP) of size $k$. We show that there is a single infinite-dimensional LP whose value simultaneously characterizes the risk of the minimum distance estimator and certifies its minimax optimality. The sharp convergence rate is obtained by evaluating this LP using complex-analytic techniques.
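To make the definition of the profile concrete, here is a minimal Python sketch that computes it for a toy population and for a random subsample. It only illustrates the definition and is not the minimum distance estimator from the paper. The function name `profile`, the toy data, and the choice to normalize by the number of types $k$ (so that empty types count toward size 0) are assumptions made for this example.

```python
from collections import Counter
import random


def profile(labels, k):
    """Return {j: fraction of the k types that have exactly j members}.

    Assumption: the profile is normalized by the number of types k,
    so empty types contribute to the mass at size 0.
    """
    type_sizes = Counter(labels)                 # size of each non-empty type
    size_counts = Counter(type_sizes.values())   # number of types per size
    size_counts[0] = k - len(type_sizes)         # types with no members
    return {j: c / k for j, c in sorted(size_counts.items()) if c > 0}


if __name__ == "__main__":
    # Toy population: k = 8 individuals, type labels drawn from range(8).
    population = [0, 0, 0, 1, 1, 2, 5, 5]
    k = len(population)
    print("population profile:", profile(population, k))

    # A subsample of size m drawn without replacement sees only part of it.
    m = 4
    subsample = random.Random(0).sample(population, m)
    print("subsample profile: ", profile(subsample, k))
```

The subsample profile generally differs from the population profile, which is why an estimator is needed to extrapolate from one to the other.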

Marco Chiani, Alberto Zanella (2020)

We present some new results on the joint distribution of an arbitrary subset of the ordered eigenvalues of complex Wishart, double Wishart, and Gaussian Hermitian random matrices of finite dimensions, using a tensor pseudo-determinant operator. Specifically, we derive compact expressions for the joint probability distribution function of the eigenvalues and the expectation of functions of the eigenvalues, including joint moments, for the case of both ordered and unordered eigenvalues.

Topics: Statistics theory, Information theory

Biwhitening Reveals the Rank of a Count Matrix

Boris Landa, Thomas T.C.K. Zhang, Yuval Kluger (2021)

Estimating the rank of a corrupted data matrix is an important task in data science, most notably for choosing the number of components in principal component analysis. Significant progress on this task has been made using random matrix theory by characterizing the spectral properties of large noise matrices. However, utilizing such tools is not straightforward when the data matrix consists of count random variables, such as Poisson or binomial, in which case the noise can be heteroskedastic with an unknown variance in each entry. In this work, focusing on a Poisson random matrix with independent entries, we propose a simple procedure termed \textit{biwhitening} that makes it possible to estimate the rank of the underlying data matrix (i.e., the Poisson parameter matrix) without any prior knowledge on its structure. Our approach is based on the key observation that one can scale the rows and columns of the data matrix simultaneously so that the spectrum of the corresponding noise agrees with the standard Marchenko-Pastur (MP) law, justifying the use of the MP upper edge as a threshold for rank selection. Importantly, the required scaling factors can be estimated directly from the observations by solving a matrix scaling problem via the Sinkhorn-Knopp algorithm. Aside from the Poisson distribution, we extend our biwhitening approach to other discrete distributions, such as the generalized Poisson, binomial, multinomial, and negative binomial. We conduct numerical experiments that corroborate our theoretical findings, and demonstrate our approach on real single-cell RNA sequencing (scRNA-seq) data, where we show that our results agree with a slightly overdispersed generalized Poisson model.
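The scaling-then-thresholding idea described in this abstract can be sketched in Python as follows. This is a rough illustration under stated assumptions, not the authors' implementation: the function names `sinkhorn_knopp` and `biwhiten_rank` are hypothetical, the row and column sum targets ($n$ and $m$ for an $m \times n$ matrix) are assumed so that the scaled noise has average unit variance per row and column, and the Poisson counts themselves are used as the variance estimate.

```python
import numpy as np


def sinkhorn_knopp(A, row_target, col_target, n_iter=1000, tol=1e-10):
    """Find positive u, v so that diag(u) @ A @ diag(v) has the given sums."""
    A = np.asarray(A, dtype=float)
    u = np.ones(A.shape[0])
    v = np.ones(A.shape[1])
    for _ in range(n_iter):
        u = row_target / (A @ v)      # match the row sums
        v = col_target / (A.T @ u)    # match the column sums
        # Column sums are now exact; stop once the row sums agree too.
        if np.abs(u * (A @ v) - row_target).max() < tol:
            break
    return u, v


def biwhiten_rank(Y):
    """Count eigenvalues of the scaled matrix above the MP upper edge."""
    Y = np.asarray(Y, dtype=float)
    m, n = Y.shape
    # Assumption: for Poisson data, the counts estimate the entrywise
    # variance, and targets (n, m) give average unit noise variance.
    u, v = sinkhorn_knopp(Y, row_target=n, col_target=m)
    Y_hat = np.sqrt(u)[:, None] * Y * np.sqrt(v)[None, :]
    eigvals = np.linalg.svd(Y_hat, compute_uv=False) ** 2 / n
    mp_edge = (1 + np.sqrt(m / n)) ** 2   # Marchenko-Pastur upper edge
    return int(np.sum(eigvals > mp_edge))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    m, n = 50, 100
    # Synthetic rank-2 Poisson parameter matrix (illustrative data only).
    rates = (np.outer(rng.uniform(1, 3, m), rng.uniform(1, 3, n))
             + np.outer(rng.uniform(1, 3, m), rng.uniform(1, 3, n)))
    Y = rng.poisson(rates)
    print("estimated rank:", biwhiten_rank(Y))
```

The sketch assumes every row and column of the count matrix has at least one nonzero entry; otherwise the scaling step divides by zero.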

Topics: Statistics theory, Information theory

On Estimation of Finite Population Proportion

Xinjia Chen (2009)

In this paper, we study the classical problem of estimating the proportion of a finite population. First, we consider a fixed sample size method and derive an explicit sample size formula which ensures a mixed criterion of absolute and relative errors. Second, we consider an inverse sampling scheme in which sampling continues until the number of units having a certain attribute reaches a threshold value or the whole population is examined. We have established a simple method to determine the threshold so that a prescribed relative precision is guaranteed. Finally, we develop a multistage sampling scheme for constructing a fixed-width confidence interval for the proportion of a finite population. Powerful computational techniques are introduced so that the fixed-width confidence interval ensures a prescribed level of coverage probability.

Topics: Statistics theory, Probability, Applications of statistics

A generalized Lieb's theorem and its applications to spectrum estimates for a sum of random matrices

De Huang (2018)

In this paper we prove the concavity of the $k$-trace functions, $A \mapsto (\text{Tr}_k[\exp(H+\ln A)])^{1/k}$, on the convex cone of all positive definite matrices. $\text{Tr}_k[A]$ denotes the $k_{\mathrm{th}}$ elementary symmetric polynomial of the eigenvalues of $A$. As an application, we use the concavity of these $k$-trace functions to derive tail bounds and expectation estimates on the sum of the $k$ largest (or smallest) eigenvalues of a sum of random matrices.
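As a small illustration of the $k$-trace defined above, the following Python sketch evaluates $\text{Tr}_k[A]$ as the $k$-th elementary symmetric polynomial of the eigenvalues. The function name `k_trace` is hypothetical, and the sketch assumes $A$ is a real symmetric matrix so that `numpy.linalg.eigvalsh` applies.

```python
from itertools import combinations
from math import prod

import numpy as np


def k_trace(A, k):
    """Tr_k[A]: sum over all k-subsets of eigenvalues of their product.

    Assumption: A is real symmetric, so its eigenvalues are real.
    """
    eigenvalues = np.linalg.eigvalsh(np.asarray(A, dtype=float))
    return sum(prod(subset) for subset in combinations(eigenvalues, k))


if __name__ == "__main__":
    A = [[2.0, 1.0], [1.0, 3.0]]
    print(k_trace(A, 1))  # coincides with the ordinary trace
    print(k_trace(A, 2))  # for a 2x2 matrix, coincides with the determinant
```

For $k = 1$ this reduces to the usual trace, and for $k$ equal to the matrix dimension it reduces to the determinant, which gives a quick sanity check.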

Topics: Statistics theory, Information theory

A simple randomized algorithm for sequential prediction of ergodic time series

L. Gyorfi, G. Lugosi, G. Morvai (2008)

We present a simple randomized procedure for the prediction of a binary sequence. The algorithm uses ideas from recent developments of the theory of the prediction of individual sequences. We show that if the sequence is a realization of a stationary and ergodic random process then the average number of mistakes converges, almost surely, to that of the optimum, given by the Bayes predictor. The desirable finite-sample properties of the predictor are illustrated by its performance for Markov processes. In such cases the predictor exhibits near optimal behavior even without knowing the order of the Markov process. Prediction with side information is also considered.

Topics: Statistics theory, Information theory
