Testing Tail Weight of a Distribution Via Hazard Rate


Published by: Kavya Ravichandran

Publication date: 2020

Research field: Informatics Engineering

Paper language: English

Authors: Maryam Aliakbarpour - Amartya Shankha Biswas - Kavya Ravichandran


Abstract

Understanding the shape of a distribution of data is of interest to people in a great variety of fields, as it may affect the types of algorithms used for that data. Given samples from a distribution, we seek to understand how many elements appear infrequently, that is, to characterize the tail of the distribution. We develop an algorithm based on a careful bucketing scheme that distinguishes heavy-tailed distributions from non-heavy-tailed ones via a definition based on the hazard rate under some natural smoothness and ordering assumptions. We verify our theoretical results empirically.
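For a distribution on ordered discrete values, the hazard rate at a value i is the probability of i conditioned on the outcome being at least i, that is, h(i) = p_i / sum_{j >= i} p_j. A minimal sketch of a plug-in estimate from samples is given below. It is illustrative only and is not the bucketing algorithm of the paper; the function name `empirical_hazard_rate` is hypothetical.

```python
from collections import Counter


def empirical_hazard_rate(samples):
    """Plug-in estimate of the discrete hazard rate h(i) = P[X = i] / P[X >= i].

    Hypothetical helper for illustration; not the paper's bucketing scheme.
    Returns a dict mapping each observed value to its estimated hazard rate.
    """
    counts = Counter(samples)
    remaining = len(samples)  # number of samples >= the current value
    hazard = {}
    for value in sorted(counts):
        hazard[value] = counts[value] / remaining
        remaining -= counts[value]
    return hazard


if __name__ == "__main__":
    # Tiny usage example on a hand-made sample.
    print(empirical_hazard_rate([1, 1, 2, 3]))
```

A geometric distribution has a constant hazard rate, so a hazard rate that decreases toward the tail is one common way to formalize a tail heavier than geometric. The plug-in estimate becomes noisy at large values, where few samples remain.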


Jun Yang, Shengyang Sun, Daniel M. Roy (2019)

The developments of Rademacher complexity and PAC-Bayesian theory have been largely independent. One exception is the PAC-Bayes theorem of Kakade, Sridharan, and Tewari (2008), which is established via Rademacher complexity theory by viewing Gibbs classifiers as linear operators. The goal of this paper is to extend this bridge between Rademacher complexity and state-of-the-art PAC-Bayesian theory. We first demonstrate that one can match the fast rate of Catoni's PAC-Bayes bounds (Catoni, 2007) using shifted Rademacher processes (Wegkamp, 2003; Lecué and Mitchell, 2012; Zhivotovskiy and Hanneke, 2018). We then derive a new fast-rate PAC-Bayes bound in terms of the flatness of the empirical risk surface on which the posterior concentrates. Our analysis establishes a new framework for deriving fast-rate PAC-Bayes bounds and yields new insights on PAC-Bayesian theory.

Machine Learning, Statistics Theory

Testing Determinantal Point Processes

Khashayar Gatmiry (2020)

Determinantal point processes (DPPs) are popular probabilistic models of diversity. In this paper, we investigate DPPs from a new perspective: property testing of distributions. Given sample access to an unknown distribution $q$ over the subsets of a ground set, we aim to distinguish whether $q$ is a DPP distribution, or $\epsilon$-far from all DPP distributions in $\ell_1$-distance. In this work, we propose the first algorithm for testing DPPs. Furthermore, we establish a matching lower bound on the sample complexity of DPP testing. This lower bound also extends to showing a new hardness result for the problem of testing the more general class of log-submodular distributions.
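In symbols, the testing problem described in this abstract could be written as follows. The notation is introduced here for illustration and is an assumption, not taken from the abstract: $[n]$ denotes the ground set and $\mathcal{D}$ the set of all DPP distributions over its subsets.

```latex
\[
H_0:\ q \in \mathcal{D}
\qquad\text{versus}\qquad
H_1:\ \inf_{p \in \mathcal{D}} \lVert q - p \rVert_1 \ge \epsilon,
\qquad\text{where}\quad
\lVert q - p \rVert_1 = \sum_{S \subseteq [n]} \lvert q(S) - p(S) \rvert .
\]
```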

Machine Learning, Statistics Theory

Generalized inverse xgamma distribution: A non-monotone hazard rate model

Harsh Tripathi, Abhimanyu Singh Yadav, Mahendra Saha (2018)

In this article, a generalized inverse xgamma distribution (GIXGD) is introduced as a generalized version of the inverse xgamma distribution. The proposed model exhibits a non-monotone hazard rate and belongs to the family of positively skewed models. Explicit expressions for some distributional properties, such as moments, inverse moments, conditional moments, mean deviation, and the quantile function, have been derived. The maximum likelihood estimation procedure has been used to estimate the unknown model parameters as well as the survival characteristics of the GIXGD. The practical applicability of the proposed model has been illustrated using survival data on guinea pigs.

Methodology, Statistics Applications

Convolution-Weight-Distribution Assumption: Rethinking the Criteria of Channel Pruning

Zhongzhan Huang, Wenqi Shao, Xinjiang Wang (2020)

Channel pruning is a popular technique for compressing convolutional neural networks (CNNs), where various pruning criteria have been proposed to remove the redundant filters. From our comprehensive experiments, we found two blind spots in the study of pruning criteria: (1) Similarity: There are some strong similarities among several primary pruning criteria that are widely cited and compared. According to these criteria, the ranks of the filters' Importance Score are almost identical, resulting in similar pruned structures. (2) Applicability: The filters' Importance Scores measured by some pruning criteria are too close to distinguish the network redundancy well. In this paper, we analyze these two blind spots on different types of pruning criteria with layer-wise pruning or global pruning. The analyses are based on the empirical experiments and our assumption (Convolutional Weight Distribution Assumption) that the well-trained convolutional filters in each layer approximately follow a Gaussian-alike distribution. This assumption has been verified through systematic and extensive statistical tests.

Machine Learning, Computer Vision and Pattern Recognition

Learning the distribution with largest mean: two bandit frameworks

Emilie Kaufmann (2017)

Over the past few years, the multi-armed bandit model has become increasingly popular in the machine learning community, partly because of applications including online content optimization. This paper reviews two different sequential learning tasks that have been considered in the bandit literature; they can be formulated as (sequentially) learning which distribution has the highest mean among a set of distributions, with some constraints on the learning process. For both of them (regret minimization and best arm identification) we present recent, asymptotically optimal algorithms. We compare the behaviors of the sampling rule of each algorithm as well as the complexity terms associated with each problem.

Machine Learning, Statistics Theory