Lexicographic and Depth-Sensitive Margins in Homogeneous and Non-Homogeneous Deep Models

Published by: Mor Shpigel Nacson
Publication date: 2019
Language: English





With an eye toward understanding complexity control in deep learning, we study how infinitesimal regularization or gradient descent optimization leads to margin-maximizing solutions in both homogeneous and non-homogeneous models, extending previous work that focused on infinitesimal regularization only in homogeneous models. To this end, we study the limit of loss minimization with a diverging norm constraint (the constrained path), relate it to the limit of a margin path, and characterize the resulting solution. For non-homogeneous ensemble models, whose output is a sum of homogeneous sub-models, we show that this solution discards the shallowest sub-models if they are unnecessary. For homogeneous models, we show convergence to a lexicographic max-margin solution, and provide conditions under which max-margin solutions are also attained as the limit of unconstrained gradient descent.
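
The objects named in the abstract can be written down roughly as follows. This is only a sketch under assumed notation: the symbols $f$, $\theta$, $\ell$, $B$, $\Theta_c$, $\Theta_m$ and the choice of binary labels are introduced here for illustration and do not appear in the abstract itself; the paper's own definitions may differ in detail.

```latex
% Assumed notation (illustrative, not taken from the abstract):
%   f(x;\theta)   model output with parameters \theta
%   (x_n, y_n)    training examples with labels y_n \in \{-1,+1\}
%   \ell          a monotonically decreasing loss, e.g. \ell(u) = \exp(-u)
%   B             the norm bound, taken to infinity in the limit studied
\begin{align*}
\mathcal{L}(\theta) &= \sum_{n=1}^{N} \ell\bigl(y_n f(x_n;\theta)\bigr), \\
\Theta_c(B) &= \arg\min_{\|\theta\| \le B} \mathcal{L}(\theta)
  && \text{(constrained path)}, \\
\Theta_m(B) &= \arg\max_{\|\theta\| \le B} \, \min_{n} \, y_n f(x_n;\theta)
  && \text{(margin path)}.
\end{align*}
% Homogeneous model of degree \alpha > 0:
%   f(x; c\theta) = c^{\alpha} f(x;\theta) \quad \text{for all } c > 0.
% Non-homogeneous ensemble: a sum of homogeneous sub-models,
%   f(x;\theta) = \sum_{k} f_k(x;\theta_k),
% where the sub-models f_k may have different degrees \alpha_k.
```

In this notation, the abstract's question is how the constrained path $\Theta_c(B)$ behaves as $B \to \infty$ and how that limit relates to the limit of the margin path $\Theta_m(B)$.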

Read also

Let $\mathcal{H}_d^{(t)}$ ($t \geq -d$, $t > -3$) be the reproducing kernel Hilbert space on the unit ball $\mathbb{B}_d$ with kernel \[ k(z,w) = \frac{1}{(1-\langle z, w \rangle)^{d+t+1}} . \] We prove that if an ideal $I \triangleleft \mathbb{C}[z_1, \ldots, z_d]$ (not necessarily homogeneous) has what we call the approximate stable division property, then the closure of $I$ in $\mathcal{H}_d^{(t)}$ is $p$-essentially normal for all $p>d$. We then show that all quasi homogeneous ideals in two variables have the stable division property, and combine these two results to obtain a new proof of the fact that the closure of any quasi homogeneous ideal in $\mathbb{C}[x,y]$ is $p$-essentially normal for $p>2$.
We develop a general method for estimating a finite mixture of non-normalized models. Here, a non-normalized model is defined to be a parametric distribution with an intractable normalization constant. Existing methods for estimating non-normalized models without computing the normalization constant are not applicable to mixture models because they contain more than one intractable normalization constant. The proposed method is derived by extending noise contrastive estimation (NCE), which estimates non-normalized models by discriminating between the observed data and some artificially generated noise. We also propose an extension of NCE with multiple noise distributions. Then, based on the observation that conventional classification learning with neural networks is implicitly assuming an exponential family as a generative model, we introduce a method for clustering unlabeled data by estimating a finite mixture of distributions in an exponential family. Estimation of this mixture model is attained by the proposed extensions of NCE where the training data of neural networks are used as noise. Thus, the proposed method provides a probabilistically principled clustering method that is able to utilize a deep representation. Application to image clustering using a deep neural network gives promising results.
In our work we study non-variational, nonlinear singularly perturbed elliptic models enjoying a double degeneracy character with prescribed boundary value in a domain. In such a scenario, we establish the existence of solutions. We also prove that solutions are locally (uniformly) Lipschitz continuous, and they grow in a linear fashion. Moreover, solutions and their free boundaries possess a sort of measure-theoretic and weak geometric properties. In addition, for a restricted class of non-linearities, we prove the finiteness of the (N-1)-dimensional Hausdorff measure of level sets. We also address a complete analysis concerning the asymptotic limit as the singular parameter, which is related to one-phase solutions of inhomogeneous nonlinear free boundary problems in flame propagation and combustion theory.
Motivated by recent experimental and numerical results, a simple unifying picture of intermittency in turbulent shear flows is suggested. Integral Structure Functions (ISF), taking into account explicitly the shear intensity, are introduced on phenomenological grounds. ISF can exhibit a universal scaling behavior, independent of the shear intensity. This picture is in satisfactory agreement with both experimental and numerical data. Possible extension to convective turbulence and implication on closure conditions for Large-Eddy Simulation of non-homogeneous flows are briefly discussed.
The numerical computation of chemical potential in dense, non-homogeneous fluids is a key problem in the study of confined fluids thermodynamics. To this day several methods have been proposed; however, there is still need for a robust technique, capable of obtaining accurate estimates at large average densities. A widely established technique is the Widom insertion method, which computes the chemical potential by sampling the energy of insertion of a test particle. Non-homogeneity is accounted for by assigning a density dependent weight to the insertion points. However, in dense systems, the poor sampling of the insertion energy is a source of inefficiency, hampering a reliable convergence. We have recently presented a new technique for the chemical potential calculation in homogeneous fluids. This novel method enhances the sampling of the insertion energy via Well-Tempered Metadynamics, reaching accurate estimates at very large densities. In this paper we extend the technique to the case of non-homogeneous fluids. The method is successfully tested on a confined Lennard-Jones fluid. In particular we show that, thanks to the improved sampling, our technique does not suffer from a systematic error that affects the classic Widom method for non-homogeneous fluids, providing a precise and accurate result.
