Reweighted Wake-Sleep

Published by: Jörg Bornschein

Publication date: 2014

Research field: Informatics Engineering

Language: English

Authors: Jörg Bornschein, Yoshua Bengio

Machine Learning

Abstract

Training deep directed graphical models with many hidden variables and performing inference remains a major challenge. Helmholtz machines and deep belief networks are such models, and the wake-sleep algorithm has been proposed to train them. The wake-sleep algorithm relies on training not just the directed generative model but also a conditional generative model (the inference network) that runs backward from visible to latent, estimating the posterior distribution of latent given visible. We propose a novel interpretation of the wake-sleep algorithm which suggests that better estimators of the gradient can be obtained by sampling latent variables multiple times from the inference network. This view is based on importance sampling as an estimator of the likelihood, with the approximate inference network as a proposal distribution. This interpretation is confirmed experimentally, showing that better likelihood can be achieved with this reweighted wake-sleep procedure. Based on this interpretation, we propose that a sigmoidal belief network is not sufficiently powerful for the layers of the inference network in order to recover a good estimator of the posterior distribution of latent variables. Our experiments show that using a more powerful layer model, such as NADE, yields substantially better generative models.
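
The importance-sampling view described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the toy one-layer sigmoid belief network, its sizes, and the names `W`, `b`, `V`, `c`, `log_joint`, and `importance_estimate` are all assumptions made for illustration. The inference network q(h|x) acts as the proposal, K latent samples are drawn from it, and the averaged importance weights p(x, h)/q(h|x) estimate the likelihood p(x). The normalized weights are the kind of quantity a reweighted gradient estimator could use to weight each sample.

```python
import itertools
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def log_bernoulli(v, p):
    # Log-probability of binary vectors v under a factorized Bernoulli(p),
    # summed over the last axis.
    return np.sum(v * np.log(p) + (1 - v) * np.log(1 - p), axis=-1)

rng = np.random.default_rng(0)
n_h, n_x = 4, 6                      # toy sizes (assumption)
prior = np.full(n_h, 0.5)            # p(h): independent fair coins
W, b = 0.5 * rng.normal(size=(n_x, n_h)), 0.5 * rng.normal(size=n_x)  # p(x|h)
V, c = 0.3 * rng.normal(size=(n_h, n_x)), 0.3 * rng.normal(size=n_h)  # q(h|x)

def log_joint(x, h):
    # log p(x, h) for one visible vector x and a batch of latent vectors h.
    return log_bernoulli(h, prior) + log_bernoulli(x, sigmoid(h @ W.T + b))

def importance_estimate(x, K, rng):
    # Draw K latent samples from the proposal q(h|x) and weight them.
    q = sigmoid(V @ x + c)
    h = (rng.random((K, n_h)) < q).astype(float)
    log_w = log_joint(x, h) - log_bernoulli(h, q)
    m = log_w.max()                  # subtract the max for numerical stability
    log_px = m + np.log(np.mean(np.exp(log_w - m)))
    w_norm = np.exp(log_w - m)
    w_norm /= w_norm.sum()           # normalized importance weights
    return log_px, w_norm

def exact_log_px(x):
    # Exact marginal by enumerating all 2**n_h latent configurations.
    all_h = np.array(list(itertools.product([0.0, 1.0], repeat=n_h)))
    lj = log_joint(x, all_h)
    m = lj.max()
    return m + np.log(np.sum(np.exp(lj - m)))

x = (rng.random(n_x) < 0.5).astype(float)
log_px_hat, w = importance_estimate(x, K=20000, rng=rng)
print("estimated log p(x):", log_px_hat)
print("exact log p(x):    ", exact_log_px(x))
```

With only four binary latent variables the exact marginal is available by enumeration, which makes it easy to check how the estimate behaves as K changes. In a deep model that enumeration is intractable, which is why a sampling-based estimator is needed in the first place.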

Read also

Natural Wake-Sleep Algorithm

Csongor Varady, 2020

The benefits of using the natural gradient are well known in a wide range of optimization problems. However, for the training of common neural networks, the resulting increase in computational complexity limits its practical application. Helmholtz Machines are a particular type of generative model composed of two Sigmoid Belief Networks (SBNs), acting as an encoder and a decoder, commonly trained using the Wake-Sleep (WS) algorithm and its reweighted version, RWS. For SBNs, it has been shown how the locality of the connections in the graphical structure induces sparsity in the Fisher information matrix. The resulting block diagonal structure can be efficiently exploited to reduce the computational complexity of the Fisher matrix inversion and thus compute the natural gradient exactly, without the need for approximations. We present a geometric adaptation of well-known methods from the literature, introducing the Natural Wake-Sleep (NWS) and the Natural Reweighted Wake-Sleep (NRWS) algorithms. We present an experimental analysis of the novel geometrical algorithms based on the convergence speed and the value of the log-likelihood, both with respect to the number of iterations and the time complexity, and demonstrate improvements on these aspects over their respective non-geometric baselines.
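
The computational point about the block diagonal Fisher matrix can be sketched as follows. This is only an illustration under stated assumptions: the blocks are random symmetric positive-definite stand-ins rather than Fisher blocks computed from an SBN, and `block_sizes`, `random_spd`, and `natural_gradient_blockwise` are hypothetical names. When the matrix is block diagonal, solving each block against its slice of the gradient gives the same natural gradient as solving the full dense system, at a much lower cost for large models.

```python
import numpy as np

rng = np.random.default_rng(0)
block_sizes = [3, 5, 2]              # hypothetical per-unit parameter counts

def random_spd(n, rng):
    # Random symmetric positive-definite matrix standing in for a Fisher block.
    a = rng.normal(size=(n, n))
    return a @ a.T + n * np.eye(n)

blocks = [random_spd(n, rng) for n in block_sizes]
grad = rng.normal(size=sum(block_sizes))

def natural_gradient_blockwise(blocks, grad):
    # Solve F_b * d_b = g_b independently for every diagonal block.
    out = np.empty_like(grad)
    start = 0
    for F_b in blocks:
        stop = start + F_b.shape[0]
        out[start:stop] = np.linalg.solve(F_b, grad[start:stop])
        start = stop
    return out

# Dense reference: assemble the full block diagonal matrix and solve once.
F = np.zeros((grad.size, grad.size))
start = 0
for F_b in blocks:
    stop = start + F_b.shape[0]
    F[start:stop, start:stop] = F_b
    start = stop

nat_block = natural_gradient_blockwise(blocks, grad)
nat_dense = np.linalg.solve(F, grad)
print(np.allclose(nat_block, nat_dense))
```

The blockwise version never forms the full matrix, so its cost grows with the cube of each block size rather than the cube of the total parameter count.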

Machine Learning

Low-Power Status Updates via Sleep-Wake Scheduling

Ahmed M. Bedewy, Yin Sun, Rahul Singh, 2021

We consider the problem of optimizing the freshness of status updates that are sent from a large number of low-power sources to a common access point. The source nodes utilize carrier sensing to reduce collisions and adopt an asynchronized sleep-wake scheduling strategy to achieve a target network lifetime (e.g., 10 years). We use age of information (AoI) to measure the freshness of status updates, and design sleep-wake parameters for minimizing the weighted-sum peak AoI of the sources, subject to per-source battery lifetime constraints. When the sensing time (i.e., the time duration of carrier sensing) is zero, this sleep-wake design problem can be solved by resorting to a two-layer nested convex optimization procedure; however, for positive sensing times, the problem is non-convex. We devise a low-complexity solution to solve this problem and prove that, for practical sensing times that are short, the solution is within a small gap from the optimum AoI performance. When the mean transmission time of status-update packets is unknown, we devise a reinforcement learning algorithm that adaptively performs the following two tasks in an "efficient" way: a) it learns the unknown parameter, and b) it generates efficient controls that make channel access decisions. We analyze its performance by quantifying its "regret", i.e., the sub-optimality gap between its average performance and the average performance of a controller that knows the mean transmission time. Our numerical and NS-3 simulation results show that our solution can indeed extend the battery lifetime of information sources while providing competitive AoI performance.

Information Theory

Hybrid Memoised Wake-Sleep: Approximate Inference at the Discrete-Continuous Interface

Tuan Anh Le, Katherine M. Collins, Luke Hewitt, 2021

Modeling complex phenomena typically involves the use of both discrete and continuous variables. Such a setting applies across a wide range of problems, from identifying trends in time-series data to performing effective compositional scene understanding in images. Here, we propose Hybrid Memoised Wake-Sleep (HMWS), an algorithm for effective inference in such hybrid discrete-continuous models. Prior approaches to learning suffer as they need to perform repeated expensive inner-loop discrete inference. We build on a recent approach, Memoised Wake-Sleep (MWS), which alleviates part of the problem by memoising discrete variables, and extend it to allow for a principled and effective way to handle continuous variables by learning a separate recognition model used for importance-sampling based approximate inference and marginalization. We evaluate HMWS in the GP-kernel learning and 3D scene understanding domains, and show that it outperforms current state-of-the-art inference methods.

Computer Vision and Pattern Recognition, Artificial Intelligence, Machine Learning

Geometry-aware Instance-reweighted Adversarial Training

Jingfeng Zhang, Jianing Zhu, Gang Niu, 2020

In adversarial machine learning, there was a common belief that robustness and accuracy hurt each other. The belief was challenged by recent studies where we can maintain the robustness and improve the accuracy. However, the other direction, whether we can keep the accuracy while improving the robustness, is conceptually and practically more interesting, since robust accuracy should be lower than standard accuracy for any model. In this paper, we show this direction is also promising. Firstly, we find even over-parameterized deep networks may still have insufficient model capacity, because adversarial training has an overwhelming smoothing effect. Secondly, given limited model capacity, we argue adversarial data should have unequal importance: geometrically speaking, a natural data point closer to/farther from the class boundary is less/more robust, and the corresponding adversarial data point should be assigned with larger/smaller weight. Finally, to implement the idea, we propose geometry-aware instance-reweighted adversarial training, where the weights are based on how difficult it is to attack a natural data point. Experiments show that our proposal boosts the robustness of standard adversarial training; combining two directions, we improve both robustness and accuracy of standard adversarial training.

Machine Learning, Artificial Intelligence

Sleep-wake classification via quantifying heart rate variability by convolutional neural network

John Malik, Yu-Lun Lo, Hau-tieng Wu, 2018

Fluctuations in heart rate are intimately tied to changes in the physiological state of the organism. We examine and exploit this relationship by classifying a human subject's wake/sleep status using their instantaneous heart rate (IHR) series. We use a convolutional neural network (CNN) to build features from the IHR series extracted from a whole-night electrocardiogram (ECG) and predict every 30 seconds whether the subject is awake or asleep. Our training database consists of 56 normal subjects, and we consider three different databases for validation; one is private, and two are public with different races and apnea severities. On our private database of 27 subjects, our accuracy, sensitivity, specificity, and AUC values for predicting the wake stage are 83.1%, 52.4%, 89.4%, and 0.83, respectively. Validation performance is similar on our two public databases. When we use photoplethysmography instead of the ECG to obtain the IHR series, the performance is also comparable. A robustness check is carried out to confirm the obtained performance statistics. This result advocates for an effective and scalable method for recognizing changes in physiological state using non-invasive heart rate monitoring. The CNN model adaptively quantifies IHR fluctuation as well as its location in time and is suitable for differentiating between the wake and sleep stages.

Statistics Applications, Data Analysis, Statistics and Probability, Machine Learning