بحث متقدم مدعوم من الذكاء الصنعي

مساحة جديدة

اشترك بالحزمة الذهبية واحصل على وصول غير محدود شمرا أكاديميا

تسجيل مستخدم جديد

Testing Mixtures of Discrete Distributions

110 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Maryam Aliakbarpour

تاريخ النشر 2019

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Maryam Aliakbarpour - Ravi Kumar - Ronitt Rubinfeld

نظرية الإحصاء بنى وهياكل البيانات والخوارزميات التعلم الالي

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

There has been significant study on the sample complexity of testing properties of distributions over large domains. For many properties, it is known that the sample complexity can be substantially smaller than the domain size. For example, over a domain of size $n$, distinguishing the uniform distribution from distributions that are far from uniform in $ell_1$-distance uses only $O(sqrt{n})$ samples. However, the picture is very different in the presence of arbitrary noise, even when the amount of noise is quite small. In this case, one must distinguish if samples are coming from a distribution that is $epsilon$-close to uniform from the case where the distribution is $(1-epsilon)$-far from uniform. The latter task requires nearly linear in $n$ samples [Valiant 2008, Valian and Valiant 2011]. In this work, we present a noise model that on one hand is more tractable for the testing problem, and on the other hand represents a rich class of noise families. In our model, the noisy distribution is a mixture of the original distribution and noise, where the latter is known to the tester either explicitly or via sample access; the form of the noise is also known a priori. Focusing on the identity and closeness testing problems leads to the following mixture testing question: Given samples of distributions $p, q_1,q_2$, can we test if $p$ is a mixture of $q_1$ and $q_2$? We consider this general question in various scenarios that differ in terms of how the tester can access the distributions, and show that indeed this problem is more tractable. Our results show that the sample complexity of our testers are exactly the same as for the classical non-mixture case.

قيم البحث

79 - Thomas B. Berrett , Cristina Butucea 2020

We find separation rates for testing multinomial or more general discrete distributions under the constraint of local differential privacy. We construct efficient randomized algorithms and test procedures, in both the case where only non-interactive privacy mechanisms are allowed and also in the case where all sequentially interactive privacy mechanisms are allowed. The separation rates are faster in the latter case. We prove general information theoretical bounds that allow us to establish the optimality of our algorithms among all pairs of privacy mechanisms and test procedures, in most usual cases. Considered examples include testing uniform, polynomially and exponentially decreasing distributions.

نظرية الإحصاء المنهجية التعلم الالي

Differentially Private Identity and Closeness Testing of Discrete Distributions

148 - Maryam Aliakbarpour , Ilias Diakonikolas , Ronitt Rubinfeld 2017

We investigate the problems of identity and closeness testing over a discrete population from random samples. Our goal is to develop efficient testers while guaranteeing Differential Privacy to the individuals of the population. We describe an approa ch that yields sample-efficient differentially private testers for these problems. Our theoretical results show that there exist private identity and closeness testers that are nearly as sample-efficient as their non-private counterparts. We perform an experimental evaluation of our algorithms on synthetic data. Our experiments illustrate that our private testers achieve small type I and type II errors with sample size sublinear in the domain size of the underlying distributions.

التعلم الآلي بنى وهياكل البيانات والخوارزميات نظرية المعلومات

An alternative approach for compatibility of two discrete conditional distributions

47 - Indranil Ghosh , Saralees Nadarajah 2017

Conditional specification of distributions is a developing area with increasing applications. In the finite discrete case, a variety of compatible conditions can be derived. In this paper, we propose an alternative approach to study the compatibility of two conditional probability distributions under the finite discrete setup. A technique based on rank-based criterion is shown to be particularly convenient for identifying compatible distributions corresponding to complete conditional specification including the case with zeros.The proposed methods are illustrated with several examples.

نظرية الإحصاء نظرية الإحصاء

Fitting phase--type scale mixtures to heavy--tailed data and distributions

59 - Mogens Bladt , Leonardo Rojas-Nandayapa 2017

We consider the fitting of heavy tailed data and distribution with a special attention to distributions with a non--standard shape in the body of the distribution. To this end we consider a dense class of heavy tailed distributions introduced recentl y, employing an EM algorithm for the the maximum likelihood estimates of its parameters. We present methods for fitting to observed data, histograms, censored data, as well as to theoretical distributions. Numerical examples are provided with simulated data and a benchmark reinsurance dataset. We empirically demonstrate that our model can provide excellent fits to heavy--tailed data/distributions with minimal assumptions

نظرية الإحصاء نظرية الإحصاء

Convergence Rates for Gaussian Mixtures of Experts

264 - Nhat Ho , Chiao-Yu Yang , Michael I. Jordan 2019

We provide a theoretical treatment of over-specified Gaussian mixtures of experts with covariate-free gating networks. We establish the convergence rates of the maximum likelihood estimation (MLE) for these models. Our proof technique is based on a n ovel notion of emph{algebraic independence} of the expert functions. Drawing on optimal transport theory, we establish a connection between the algebraic independence and a certain class of partial differential equations (PDEs). Exploiting this connection allows us to derive convergence rates and minimax lower bounds for parameter estimation.

نظرية الإحصاء التعلم الآلي التعلم الالي

سجل دخول لتتمكن من نشر تعليقات

التعليقات

جاري جلب التعليقات

سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها

جامعة دمشق

تفاصيل إضافية المزيد من الجامعات

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

Testing Mixtures of Discrete Distributions

اسأل ChatGPT حول البحث

ﻻ يوجد ملخص باللغة العربية

اقرأ أيضاً