
A Bennett Inequality for the Missing Mass

Publication date: 2015
Language: English





Novel concentration inequalities are obtained for the missing mass, i.e. the total probability mass of the outcomes not observed in the sample. We derive distribution-free deviation bounds with sublinear exponents in deviation size for the missing mass, and improve the results of Berend and Kontorovich (2013) and Yari Saeed Khanloo and Haffari (2015) for small deviations, which is the most important case in learning theory.
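As a concrete illustration of the quantity being bounded, the sketch below computes the missing mass of a sample drawn from a hypothetical discrete distribution; the distribution, seed, and sample size are illustrative assumptions, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical discrete distribution with a fast-decaying tail.
p = 0.5 ** np.arange(1, 51)
p /= p.sum()

def missing_mass(sample, probs):
    """Total probability of the outcomes that never appear in the sample."""
    seen = set(int(s) for s in sample)
    return float(sum(q for k, q in enumerate(probs) if k not in seen))

sample = rng.choice(len(p), size=100, p=p)
m = missing_mass(sample, p)
# The missing mass is always a value in [0, 1]; it is 0 when every outcome was seen.
```

The concentration results above bound how far this random quantity deviates from its expectation.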



Related research

We are concerned with obtaining novel concentration inequalities for the missing mass, i.e. the total probability mass of the outcomes not observed in the sample. We not only derive - for the first time - distribution-free Bernstein-like deviation bounds with sublinear exponents in deviation size for the missing mass, but also improve the results of McAllester and Ortiz (2003) and Berend and Kontorovich (2013, 2012) for small deviations, which is the most interesting case in learning theory. It is known that the majority of standard inequalities cannot be directly used to analyze heterogeneous sums, i.e. sums whose terms differ greatly in magnitude. Our generic and intuitive approach shows that the heterogeneity issue introduced in McAllester and Ortiz (2003) is resolvable, at least in the case of the missing mass, by regulating the terms using our novel thresholding technique.
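The heterogeneity issue can be seen numerically: when a weighted Bernoulli sum is dominated by a few large weights, a standard range-based bound such as Hoeffding's is far looser than the actual tail. A small simulation sketch, with hypothetical weights and firing probabilities of our own choosing:

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical heterogeneous weights: two large terms dominate 25 tiny ones.
w = np.array([0.5, 0.25] + [0.01] * 25)
q = np.full(len(w), 0.1)          # each Bernoulli term fires with probability 0.1

mean = float(np.sum(w * q))
t = 0.3
# Hoeffding for independent X_i in [0, w_i]: P(S - E[S] >= t) <= exp(-2 t^2 / sum w_i^2)
hoeffding = float(np.exp(-2 * t ** 2 / np.sum(w ** 2)))

trials = 100_000
S = (rng.random((trials, len(w))) < q).astype(float) @ w
empirical = float(np.mean(S - mean >= t))
# The bound holds, but is several times larger than the observed tail frequency.
```

This gap is what motivates variance-sensitive (Bennett/Bernstein-style) treatments of such sums.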
We present a new second-order oracle bound for the expected risk of a weighted majority vote. The bound is based on a novel parametric form of the Chebyshev-Cantelli inequality (a.k.a. one-sided Chebyshev's inequality), which is amenable to efficient minimization. The new form resolves the optimization challenge faced by prior oracle bounds based on the Chebyshev-Cantelli inequality, the C-bounds [Germain et al., 2015], and, at the same time, it improves on the oracle bound based on the second-order Markov's inequality introduced by Masegosa et al. [2020]. We also derive the PAC-Bayes-Bennett inequality, which we use for empirical estimation of the oracle bound. The PAC-Bayes-Bennett inequality improves on the PAC-Bayes-Bernstein inequality by Seldin et al. [2012]. We provide an empirical evaluation demonstrating that the new bounds can improve on the work by Masegosa et al. [2020]. Both the parametric form of the Chebyshev-Cantelli inequality and the PAC-Bayes-Bennett inequality may be of independent interest for the study of concentration of measure in other domains.
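The classical starting point, the one-sided Chebyshev (Cantelli) inequality P(X - E[X] >= t) <= sigma^2 / (sigma^2 + t^2), is easy to state in code. The helper below illustrates only the classical inequality, not the paper's parametric form:

```python
def cantelli_bound(variance, t):
    """One-sided Chebyshev (Cantelli) bound: P(X - E[X] >= t) <= var / (var + t^2)."""
    if t <= 0 or variance < 0:
        raise ValueError("need t > 0 and variance >= 0")
    return variance / (variance + t * t)

# For unit variance and deviation t = 2, the upper tail is at most 1 / 5.
print(cantelli_bound(1.0, 2.0))  # 0.2
```

Unlike the two-sided Chebyshev bound 1/t^2, this one-sided version is informative even for t = 1 (it gives 1/2 rather than 1).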
In this paper, we are concerned with obtaining distribution-free concentration inequalities for mixtures of independent Bernoulli variables that incorporate a notion of variance. The missing mass is the total probability mass associated with the outcomes that have not been seen in a given sample; it is an important quantity connecting density estimates obtained from a sample to the population for discrete distributions. We are therefore specifically motivated to apply our method to study the concentration of the missing mass - which can be expressed as a mixture of Bernoulli variables - in a novel way. We not only derive - for the first time - Bernstein-like large deviation bounds for the missing mass whose exponents behave almost linearly with respect to deviation size, but also sharpen the results of McAllester and Ortiz (2003) and Berend and Kontorovich (2013) for large sample sizes in the case of small deviations, which is the most interesting case in learning theory. In the meantime, our approach shows that the heterogeneity issue introduced in McAllester and Ortiz (2003) is resolvable in the case of the missing mass, in the sense that one can use standard inequalities, although this may not lead to strong results. Thus, we postulate that our results are general and can be applied to provide potentially sharp Bernstein-like bounds under some constraints.
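The view of the missing mass as a Bernoulli mixture can be checked by simulation: outcome k is unseen in an i.i.d. sample of size n with probability (1 - p_k)^n, so E[M] = sum_k p_k (1 - p_k)^n. A sketch with a hypothetical four-outcome distribution (the distribution, n, and seed are our own illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(1)
p = np.array([0.4, 0.3, 0.2, 0.1])
n = 10

# Each outcome k is unseen with probability (1 - p_k)^n, so the missing mass
# M = sum_k p_k * Bernoulli((1 - p_k)^n) is a weighted mixture of Bernoullis.
expected = float(np.sum(p * (1 - p) ** n))

trials = 20_000
ms = np.empty(trials)
for i in range(trials):
    seen = np.unique(rng.choice(len(p), size=n, p=p))
    ms[i] = 1.0 - p[seen].sum()

# The Monte Carlo mean of M should approach the closed-form expectation.
```

Note that the Bernoulli terms are negatively correlated (seeing one outcome leaves fewer draws for the others), which is part of what makes the analysis delicate.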
Missing data imputation can help improve the performance of prediction models in situations where missing data hide useful information. This paper compares methods for imputing missing categorical data for supervised classification tasks. We experiment on two machine learning benchmark datasets with missing categorical data, comparing classifiers trained on non-imputed (i.e., one-hot encoded) or imputed data with different levels of additional missing-data perturbation. We show that imputation methods can increase predictive accuracy in the presence of missing-data perturbation, and that the perturbation itself can improve prediction accuracy by regularizing the classifier. We achieve state-of-the-art results on the Adult dataset with missing-data perturbation and k-nearest-neighbors (k-NN) imputation.
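As a rough illustration of k-NN imputation for categorical data, the toy Hamming-similarity imputer below fills each missing entry by majority vote among the most similar rows. The dataset, the similarity measure, and k are our own assumptions, not the paper's pipeline:

```python
# Toy categorical dataset; None marks a missing value (hypothetical data).
X = [["red", "S"], ["red", "M"], ["blue", "M"], [None, "M"], ["blue", None]]

def knn_impute(rows, k=2):
    """Fill each missing entry with the majority value among the k rows that
    agree most often with it on the columns both have observed."""
    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, v in enumerate(row):
            if v is None:
                # Candidates: rows with column j observed, scored by overlap.
                cands = [(sum(a == b for a, b in zip(row, other)
                              if a is not None and b is not None), other[j])
                         for other in rows if other[j] is not None]
                cands.sort(key=lambda t: -t[0])
                votes = [val for _, val in cands[:k]]
                filled[i][j] = max(set(votes), key=votes.count)
    return filled

imputed = knn_impute(X)
# Every cell is now filled; observed values are left untouched.
```

For real workloads one would typically one-hot encode and use a library imputer (e.g. scikit-learn's KNNImputer) rather than this sketch.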
Many statistical models are given in the form of unnormalized densities for which calculation of the normalization constant is intractable. We propose estimation methods for such unnormalized models with missing data. The key idea is to combine imputation techniques with estimators for unnormalized models, including noise contrastive estimation and score matching. In addition, we derive asymptotic distributions of the proposed estimators and construct confidence intervals. Simulation results with truncated Gaussian graphical models and an application to real wind-direction data show that the proposed methods effectively enable statistical inference with unnormalized models from missing data.
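Noise contrastive estimation (without the missing-data component) can be sketched for a one-dimensional unnormalized Gaussian: the log-normalizer c is treated as a free parameter, and (mu, c) are fit by logistic discrimination of data from noise drawn from a known density. The model, noise distribution, and gradient-ascent settings below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)

x_data = rng.normal(1.5, 1.0, size=2000)    # observations from N(1.5, 1)
x_noise = rng.normal(0.0, 2.0, size=2000)   # noise with known density N(0, 4)

def log_noise(x):
    # log density of N(0, 2^2)
    return -0.125 * x ** 2 - np.log(2.0 * np.sqrt(2.0 * np.pi))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Model: unnormalized density exp(-(x - mu)^2 / 2 - c), with c a free parameter.
mu, c = 0.0, 0.0
for _ in range(500):
    gd = (-0.5 * (x_data - mu) ** 2 - c) - log_noise(x_data)   # log-odds on data
    gn = (-0.5 * (x_noise - mu) ** 2 - c) - log_noise(x_noise) # log-odds on noise
    sd, sn = sigmoid(gd), sigmoid(gn)
    # Gradient ascent on the NCE objective
    # J = E_data[log sigmoid(G)] + E_noise[log(1 - sigmoid(G))].
    mu += 0.1 * (np.mean((1 - sd) * (x_data - mu)) - np.mean(sn * (x_noise - mu)))
    c += 0.1 * (np.mean(sn) - np.mean(1 - sd))

# mu should approach the true mean 1.5, and c the true log-normalizer log sqrt(2*pi).
```

The point of the trick is that c is estimated alongside mu, so the intractable normalization constant never has to be computed.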
