
Learning convex polytopes with margin

Published by: Aryeh Kontorovich
Publication date: 2018
Research field: Informatics Engineering
Research language: English





We present an improved algorithm for properly learning convex polytopes in the realizable PAC setting from data with a margin. Our learning algorithm constructs a consistent polytope as an intersection of about $t \log t$ halfspaces with margins in time polynomial in $t$ (where $t$ is the number of halfspaces forming an optimal polytope). We also identify distinct generalizations of the notion of margin from hyperplanes to polytopes and investigate how they relate geometrically; this result may be of interest beyond the learning setting.
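To make the objects in the abstract concrete, the sketch below represents a polytope as an intersection of halfspaces and checks whether it is consistent with a labelled sample under one possible notion of margin. It is an illustration only, not the paper's algorithm: the names `signed_value`, `in_polytope`, `consistent_with_margin` and the example `SQUARE` are hypothetical, normals are assumed to have unit length, and the margin rule used here (positives lie at least `gamma` inside every halfspace, negatives lie at least `gamma` outside some halfspace) is just one of the generalizations the abstract alludes to.

```python
from typing import List, Sequence, Tuple

# A halfspace is (w, b) and contains the points x with w.x + b >= 0.
# Assumption: w has unit length, so w.x + b is a signed distance.
Halfspace = Tuple[Sequence[float], float]


def signed_value(h: Halfspace, x: Sequence[float]) -> float:
    """Signed distance of x from the boundary of halfspace h."""
    w, b = h
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def in_polytope(halfspaces: List[Halfspace], x: Sequence[float]) -> bool:
    """A point is in the polytope iff it lies in every halfspace."""
    return all(signed_value(h, x) >= 0 for h in halfspaces)


def consistent_with_margin(
    halfspaces: List[Halfspace],
    points: List[Sequence[float]],
    labels: List[int],
    gamma: float,
) -> bool:
    """Check consistency under an assumed (illustrative) margin rule.

    Positive points (label 1) must be at least gamma inside every halfspace;
    negative points (label 0) must be at least gamma outside some halfspace.
    """
    for x, y in zip(points, labels):
        worst = min(signed_value(h, x) for h in halfspaces)
        if y == 1 and worst < gamma:
            return False
        if y != 1 and worst > -gamma:
            return False
    return True


# Hypothetical example: the unit square [0, 1]^2 as four halfspaces.
SQUARE: List[Halfspace] = [
    ((1.0, 0.0), 0.0),   # x >= 0
    ((-1.0, 0.0), 1.0),  # x <= 1
    ((0.0, 1.0), 0.0),   # y >= 0
    ((0.0, -1.0), 1.0),  # y <= 1
]
```

For instance, with the sample `[(0.5, 0.5), (1.5, 0.5)]` labelled `[1, 0]`, the square is consistent at `gamma = 0.25` but not at `gamma = 0.6`, because the positive point is only 0.5 away from the boundary.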


Read also

We propose a new geometric method for measuring the quality of representations obtained from deep learning. Our approach, called Random Polytope Descriptor, provides an efficient description of data points based on the construction of random convex polytopes. We demonstrate the use of our technique by qualitatively comparing the behavior of classic and regularized autoencoders. This reveals that applying regularization to autoencoder networks may decrease the out-of-distribution detection performance in latent space. While our technique is similar in spirit to $k$-means clustering, we achieve significantly better false positive/negative balance in clustering tasks on autoencoded datasets.
Structured output prediction is an important machine learning problem both in theory and practice, and the max-margin Markov network (M$^3$N) is an effective approach. All state-of-the-art algorithms for optimizing M$^3$N objectives take at least $O(1/\epsilon)$ iterations to find an $\epsilon$-accurate solution. Recent results in structured optimization suggest that faster rates are possible by exploiting the structure of the objective function. Towards this end, Nesterov (2005) proposed an excessive gap reduction technique based on Euclidean projections which converges in $O(1/\sqrt{\epsilon})$ iterations on strongly convex functions. Unfortunately, when applied to M$^3$Ns, this approach does not admit graphical model factorization which, as in many existing algorithms, is crucial for keeping the cost per iteration tractable. In this paper, we present a new excessive gap reduction technique based on Bregman projections which admits graphical model factorization naturally, and converges in $O(1/\sqrt{\epsilon})$ iterations. Compared with existing algorithms, the convergence rate of our method has better dependence on $\epsilon$ and other parameters of the problem, and can be easily kernelized.
Support vector regression (SVR) is one of the most popular machine learning algorithms; it aims to generate the optimal regression curve by maximizing the minimal margin of selected training samples, i.e., support vectors. Recent research reveals that maximizing the margin distribution of the whole training dataset, rather than the minimal margin of a few support vectors, tends to achieve better generalization performance. However, margin distribution support vector regression machines suffer from the difficulty of solving a non-convex quadratic optimization problem, in contrast to the margin distribution strategy for support vector classification. This paper first proposes a maximal margin distribution model for SVR (MMD-SVR), then introduces a coupled constraint factor to convert the non-convex quadratic optimization into a convex problem with linear constraints, which improves the training feasibility and efficiency of SVR derived from maximizing the margin distribution. The theoretical and empirical analysis illustrates the superiority of MMD-SVR. In addition, numerical experiments show that MMD-SVR can significantly improve prediction accuracy and generate a smoother regression curve with better generalization than the classic SVR.
Yiwen Guo, Changshui Zhang, 2021
This paper serves as a survey of recent advances in large margin training and its theoretical foundations, mostly for (nonlinear) deep neural networks (DNNs), which have probably been the most prominent machine learning models for large-scale data in the community over the past decade. We generalize the formulation of classification margins from classical research to the latest DNNs, summarize theoretical connections between the margin, network generalization, and robustness, and comprehensively introduce recent efforts in enlarging the margins for DNNs. Since the viewpoints of the different methods differ, we categorize them into groups for ease of comparison and discussion in the paper. Hopefully, our discussions and overview inspire new research work in the community that aims to improve the performance of DNNs, and we also point to directions where the large margin principle can be verified to provide theoretical evidence of why certain regularizations for DNNs function well in practice. We managed to shorten the paper such that the crucial spirit of large margin learning and related methods is better emphasized.
Though learning has become a core technology of modern information processing, there is now ample evidence that it can lead to biased, unsafe, and prejudiced solutions. The need to impose requirements on learning is therefore paramount, especially as it reaches critical applications in social, industrial, and medical domains. However, the non-convexity of most modern learning problems is only exacerbated by the introduction of constraints. Whereas good unconstrained solutions can often be learned using empirical risk minimization (ERM), even obtaining a model that satisfies statistical constraints can be challenging, all the more so a good one. In this paper, we overcome this issue by learning in the empirical dual domain, where constrained statistical learning problems become unconstrained, finite dimensional, and deterministic. We analyze the generalization properties of this approach by bounding the empirical duality gap, i.e., the difference between our approximate, tractable solution and the solution of the original (non-convex) statistical problem, and provide a practical constrained learning algorithm. These results establish a constrained counterpart of classical learning theory and enable the explicit use of constraints in learning. We illustrate this algorithm and theory in rate-constrained learning applications.
