
Your Classifier is Secretly an Energy Based Model and You Should Treat it Like One

Published by Will Grathwohl
Publication date: 2019
Research field: Informatics Engineering
Research language: English





We propose to reinterpret a standard discriminative classifier of p(y|x) as an energy based model for the joint distribution p(x,y). In this setting, the standard class probabilities can be easily computed as well as unnormalized values of p(x) and p(x|y). Within this framework, standard discriminative architectures may be used and the model can also be trained on unlabeled data. We demonstrate that energy based training of the joint distribution improves calibration, robustness, and out-of-distribution detection while also enabling our models to generate samples rivaling the quality of recent GAN approaches. We improve upon recently proposed techniques for scaling up the training of energy based models and present an approach which adds little overhead compared to standard classification training. Our approach is the first to achieve performance rivaling the state-of-the-art in both generative and discriminative learning within one hybrid model.
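The reinterpretation described in the abstract can be illustrated with a minimal sketch. It assumes the usual construction in which the classifier's logits f(x) are read as negative energies, so that p(x,y) is proportional to exp(f(x)[y]) and p(x) is proportional to the sum of exp(f(x)[y]) over classes. The function names and the example logits are hypothetical and are not taken from the paper's code.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def class_probs(logits):
    """Ordinary softmax p(y|x); the unknown normalizer Z cancels out."""
    lse = logsumexp(logits)
    return [math.exp(l - lse) for l in logits]


def joint_energy(logits, y):
    """E(x, y) = -f(x)[y], so p(x, y) is proportional to exp(f(x)[y])."""
    return -logits[y]


def marginal_energy(logits):
    """E(x) = -logsumexp_y f(x)[y], so p(x) is proportional to sum_y exp(f(x)[y])."""
    return -logsumexp(logits)


# Hypothetical logits f(x) produced by a 3-class classifier for one input x.
logits = [2.0, 0.5, -1.0]
probs = class_probs(logits)
# Unnormalized log p(x): larger values mean the model finds x more likely.
unnorm_log_px = -marginal_energy(logits)
```

Under these assumptions, the unnormalized log p(x,y) minus the unnormalized log p(x) recovers log p(y|x) exactly, which is why the classifier's predictions are unchanged by the energy based reading. The quantity `unnorm_log_px` is the kind of score that could be thresholded for out-of-distribution detection.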




Read also

We show that the sum of the implicit generator log-density $\log p_g$ of a GAN with the logit score of the discriminator defines an energy function which yields the true data density when the generator is imperfect but the discriminator is optimal, thus making it possible to improve on the typical generator (with implicit density $p_g$). To make that practical, we show that sampling from this modified density can be achieved by sampling in latent space according to an energy-based model induced by the sum of the latent prior log-density and the discriminator output score. This can be achieved by running a Langevin MCMC in latent space and then applying the generator function, which we call Discriminator Driven Latent Sampling (DDLS). We show that DDLS is highly efficient compared to previous methods which work in the high-dimensional pixel space and can be applied to improve on previously trained GANs of many types. We evaluate DDLS on both synthetic and real-world datasets qualitatively and quantitatively. On CIFAR-10, DDLS substantially improves the Inception Score of an off-the-shelf pre-trained SN-GAN from $8.22$ to $9.09$, which is even comparable to the class-conditional BigGAN model. This achieves a new state-of-the-art in the unconditional image synthesis setting without introducing extra parameters or additional training.
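The latent-space sampling described in this abstract can be sketched on a one-dimensional toy problem. The sketch assumes a standard normal latent prior, an identity "generator", and a linear discriminator logit, all of which are hypothetical stand-ins chosen so the target is known in closed form. It is not the authors' implementation. The energy is E(z) = -(log p0(z) + d(G(z))), and each Langevin step moves z down the energy gradient and adds Gaussian noise.

```python
import math
import random


def generator(z):
    """Hypothetical toy generator: the identity map from latent to data space."""
    return z


def discriminator_logit(x):
    """Hypothetical toy discriminator logit; a real one would be a trained network."""
    return x


def energy(z):
    """E(z) = -(log p0(z) + d(G(z))) with a standard normal prior (constants dropped)."""
    log_prior = -0.5 * z * z
    return -(log_prior + discriminator_logit(generator(z)))


def grad_energy(z, h=1e-4):
    """Central finite-difference gradient; autodiff would be used in practice."""
    return (energy(z + h) - energy(z - h)) / (2.0 * h)


def ddls_sample(rng, steps=200, step_size=0.1):
    """Run Langevin dynamics in latent space, then apply the generator."""
    z = rng.gauss(0.0, 1.0)  # start from the latent prior
    for _ in range(steps):
        noise = rng.gauss(0.0, 1.0)
        z = z - 0.5 * step_size * grad_energy(z) + math.sqrt(step_size) * noise
    return generator(z)


rng = random.Random(0)
samples = [ddls_sample(rng) for _ in range(500)]
sample_mean = sum(samples) / len(samples)
```

With these toy choices the energy is z²/2 - z, so the target density is a unit-variance Gaussian centered at 1, and `sample_mean` should land near 1 rather than at the prior mean of 0. The shift away from the prior is the effect of the discriminator term.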
Topological data analysis aims to extract topological quantities from data, which tend to focus on the broader global structure of the data rather than local information. The Mapper method, specifically, generalizes clustering methods to identify significant global mathematical structures, which are out of reach of many other approaches. We propose a classifier based on applying the Mapper algorithm to data projected onto a latent space. We obtain the latent space by using PCA or autoencoders. Notably, a classifier based on the Mapper method is immune to any gradient-based attack, and improves robustness over traditional CNNs (convolutional neural networks). We report theoretical justification and some numerical experiments that confirm our claims.
The C/O-ratio as traced with C$_2$H emission in protoplanetary disks is fundamental for constraining the formation mechanisms of exoplanets and our understanding of volatile depletion in disks, but current C$_2$H observations show an apparent bimodal distribution which is not well understood, indicating that the C/O distribution is not described by a simple radial dependence. The transport of icy pebbles has been suggested to alter the local elemental abundances in protoplanetary disks, through settling, drift and trapping in pressure bumps resulting in a depletion of volatiles in the surface and an increase of the elemental C/O. We combine all disks with spatially resolved ALMA C$_2$H observations with high-resolution continuum images and constraints on the CO snowline to determine if the C$_2$H emission is indeed related to the location of the icy pebbles. We report a possible correlation between the presence of a significant CO-icy dust reservoir and high C$_2$H emission, which is only found in disks with dust rings outside the CO snowline. In contrast, compact dust disks (without pressure bumps) and warm transition disks (with their dust ring inside the CO snowline) are not detected in C$_2$H, suggesting that such disks may never have contained a significant CO ice reservoir. This correlation provides evidence for the regulation of the C/O profile by the complex interplay of CO snowline and pressure bump locations in the disk. These results demonstrate the importance of including dust transport in chemical disk models, for a proper interpretation of exoplanet atmospheric compositions, and a better understanding of volatile depletion in disks, in particular the use of CO isotopologues to determine gas surface densities.
Neural networks are commonly used as models for classification for a wide variety of tasks. Typically, a learned affine transformation is placed at the end of such models, yielding a per-class value used for classification. This classifier can have a vast number of parameters, which grows linearly with the number of possible classes, thus requiring increasingly more resources. In this work we argue that this classifier can be fixed, up to a global scale constant, with little or no loss of accuracy for most tasks, allowing memory and computational benefits. Moreover, we show that by initializing the classifier with a Hadamard matrix we can speed up inference as well. We discuss the implications for current understanding of neural network models.
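The idea of a fixed final layer can be sketched as follows. The sketch assumes the Sylvester construction for the Hadamard matrix and a feature dimension that is a power of two. The function names and the `scale` parameter are hypothetical illustrations of the "global scale constant" mentioned in the abstract, not the authors' code.

```python
def hadamard(n):
    """Sylvester construction of an n x n Hadamard matrix (n must be a power of two)."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a positive power of two")
    h = [[1]]
    while len(h) < n:
        # Block form [[H, H], [H, -H]] doubles the size at each step.
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    return h


def fixed_classifier_logits(features, num_classes, scale=1.0):
    """Logits from a non-learned classifier: the first num_classes Hadamard rows."""
    h = hadamard(len(features))
    if num_classes > len(h):
        raise ValueError("num_classes cannot exceed the feature dimension")
    return [scale * sum(w * f for w, f in zip(h[c], features))
            for c in range(num_classes)]


# Hypothetical 4-dimensional feature vector from the network's penultimate layer.
logits = fixed_classifier_logits([1.0, 2.0, 3.0, 4.0], num_classes=3)
```

Because the rows of a Hadamard matrix are mutually orthogonal and contain only +1 and -1, the layer needs no stored weights and no multiplications, only additions and subtractions, which is one plausible reading of the memory and inference benefits the abstract claims.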
Evaluation of a hydrocarbon reservoir requires classification of petrophysical properties from the available dataset. However, characterization of reservoir attributes is difficult due to the nonlinear and heterogeneous nature of the subsurface physical properties. In this context, the present study proposes a generalized one-class classification framework based on Support Vector Data Description (SVDD) to classify a reservoir characteristic, water saturation, into two classes (Class high and Class low) from four logs, namely gamma ray, neutron porosity, bulk density, and P sonic, using an imbalanced dataset. A comparison is carried out between the proposed framework and different supervised classification algorithms in terms of g-metric means and execution time. Experimental results show that the proposed framework outperforms the other classifiers in terms of these performance evaluators. It is envisaged that the classification analysis performed in this study will be useful in further reservoir modeling.
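The g metric mean mentioned in this abstract is not defined there. The sketch below assumes it refers to the geometric mean of sensitivity and specificity that is commonly reported for imbalanced two-class problems. The function name and the example labels are hypothetical.

```python
import math


def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity (true positive rate) and specificity (true negative rate)."""
    tp = fn = tn = fp = 0
    for t, p in zip(y_true, y_pred):
        if t == positive:
            if p == positive:
                tp += 1
            else:
                fn += 1
        else:
            if p == positive:
                fp += 1
            else:
                tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)


# Hypothetical imbalanced labels: 2 samples of the minority class, 6 of the majority.
y_true = [1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 1]
score = g_mean(y_true, y_pred)
# A classifier that always predicts the majority class scores zero.
majority_only = g_mean(y_true, [0] * len(y_true))
```

Unlike plain accuracy, this score drops to zero when a classifier ignores the minority class entirely, which is why it suits imbalanced datasets such as the one described.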
