Novel concentration inequalities are obtained for the missing mass, i.e. the total probability mass of the outcomes not observed in the sample. We derive distribution-free deviation bounds with sublinear exponents in deviation size for the missing mass, and improve the results of Berend and Kontorovich (2013) and Yari Saeed Khanloo and Haffari (2015) for small deviations, which is the most important case in learning theory.
We are concerned with obtaining novel concentration inequalities for the missing mass, i.e. the total probability mass of the outcomes not observed in the sample. We not only derive, for the first time, distribution-free Bernstein-like deviation bounds with sublinear exponents in deviation size for the missing mass, but also improve the results of McAllester and Ortiz (2003) and Berend and Kontorovich (2013, 2012) for small deviations, which is the most interesting case in learning theory. It is known that the majority of standard inequalities cannot be directly used to analyze heterogeneous sums, i.e. sums whose terms differ greatly in magnitude. Our generic and intuitive approach shows that the heterogeneity issue introduced in McAllester and Ortiz (2003) is resolvable, at least in the case of the missing mass, by regulating the terms using our novel thresholding technique.
In this paper, we are concerned with obtaining distribution-free concentration inequalities for mixtures of independent Bernoulli variables that incorporate a notion of variance. The missing mass is the total probability mass associated with the outcomes that have not been seen in a given sample. It is an important quantity that connects density estimates obtained from a sample to the population for discrete distributions. We are therefore specifically motivated to apply our method to study the concentration of the missing mass, which can be expressed as a mixture of Bernoulli variables, in a novel way. We not only derive, for the first time, Bernstein-like large deviation bounds for the missing mass whose exponents behave almost linearly with respect to deviation size, but also sharpen the results of McAllester and Ortiz (2003) and Berend and Kontorovich (2013) for large sample sizes in the case of small deviations, which is the most interesting case in learning theory. At the same time, our approach shows that the heterogeneity issue introduced in McAllester and Ortiz (2003) is resolvable in the case of the missing mass, in the sense that one can use standard inequalities, although doing so may not lead to strong results. Thus, we postulate that our results are general and can be applied to provide potentially sharp Bernstein-like bounds under some constraints.
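To make the definition concrete, the following is a minimal Python sketch of the missing mass for a known discrete distribution, not code from the paper. The function names (`missing_mass`, `expected_missing_mass`, `simulate`) and the example distribution are illustrative assumptions. The sketch assumes i.i.d. draws, so outcome i is unseen after n draws with probability (1 - p_i)^n and the missing mass is a sum of the p_i weighted by Bernoulli indicators.

```python
import random


def missing_mass(probs, sample):
    """Total probability of the outcomes that never appear in the sample.

    probs[i] is the probability of outcome i; sample is a list of outcome indices.
    """
    seen = set(sample)
    return sum(p for outcome, p in enumerate(probs) if outcome not in seen)


def expected_missing_mass(probs, n):
    """Expected missing mass after n i.i.d. draws: sum_i p_i * (1 - p_i)^n."""
    return sum(p * (1.0 - p) ** n for p in probs)


def simulate(probs, n, trials, seed=0):
    """Draw `trials` independent samples of size n and return their missing masses."""
    rng = random.Random(seed)
    outcomes = range(len(probs))
    return [
        missing_mass(probs, rng.choices(outcomes, weights=probs, k=n))
        for _ in range(trials)
    ]


if __name__ == "__main__":
    probs = [0.5, 0.3, 0.2]  # hypothetical distribution, for illustration only
    values = simulate(probs, n=5, trials=2000)
    print("empirical mean:", sum(values) / len(values))
    print("expected value:", expected_missing_mass(probs, 5))
```

Comparing the empirical spread of `values` around `expected_missing_mass` for growing n gives an informal picture of the deviations that concentration inequalities of this kind aim to bound.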