
Hierarchical Indian Buffet Neural Networks for Bayesian Continual Learning

Published by Samuel Kessler
Publication date: 2019
Paper language: English





We place an Indian Buffet process (IBP) prior over the structure of a Bayesian Neural Network (BNN), thus allowing the complexity of the BNN to increase and decrease automatically. We further extend this model such that the prior on the structure of each hidden layer is shared globally across all layers, using a Hierarchical-IBP (H-IBP). We apply this model to the problem of resource allocation in Continual Learning (CL) where new tasks occur and the network requires extra resources. Our model uses online variational inference with reparameterisation of the Bernoulli and Beta distributions, which constitute the IBP and H-IBP priors. As we automatically learn the number of weights in each layer of the BNN, overfitting and underfitting problems are largely overcome. We show empirically that our approach offers a competitive edge over existing methods in CL.
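As a rough illustration of the kind of prior the abstract describes, the sketch below draws unit-activation probabilities from a truncated stick-breaking IBP, shares them across layers through a Beta draw centred on the global probabilities, and uses a Bernoulli mask to switch hidden units on or off. This only samples from a prior; it does not perform the online variational inference or the reparameterisation mentioned above. The stick-breaking form, the truncation level, the `concentration` parameter, and all function names are assumptions made for illustration, not taken from the paper.

```python
import numpy as np


def sample_ibp_stick_breaking(alpha, truncation, rng):
    """Truncated stick-breaking IBP: pi_k = prod_{j<=k} v_j, v_j ~ Beta(alpha, 1)."""
    v = rng.beta(alpha, 1.0, size=truncation)
    return np.cumprod(v)  # non-increasing activation probabilities


def sample_hibp_layer_probs(pi_global, concentration, n_layers, rng):
    """Per-layer probabilities drawn from a Beta centred on the global ones.

    Assumed hierarchical form: pi_k^l ~ Beta(c * pi_k, c * (1 - pi_k)).
    """
    eps = 1e-6  # keep both Beta parameters strictly positive
    p = np.clip(pi_global, eps, 1.0 - eps)
    return rng.beta(concentration * p, concentration * (1.0 - p),
                    size=(n_layers, p.shape[0]))


def sample_layer_mask(pi, n_rows, rng):
    """Binary mask Z with Z[n, k] ~ Bernoulli(pi[k])."""
    return (rng.random((n_rows, pi.shape[0])) < pi).astype(np.float64)


rng = np.random.default_rng(0)
pi_global = sample_ibp_stick_breaking(alpha=5.0, truncation=20, rng=rng)
layer_pi = sample_hibp_layer_probs(pi_global, concentration=10.0,
                                   n_layers=3, rng=rng)

# Mask the activations of the first hidden layer for a batch of 4 inputs.
Z = sample_layer_mask(layer_pi[0], n_rows=4, rng=rng)
hidden = rng.normal(size=(4, 20))
masked_hidden = hidden * Z  # units with Z == 0 are switched off
```

Because the stick-breaking probabilities shrink with the unit index, later units are rarely active, which is how a prior of this kind lets the effective width of a layer grow or shrink with the data.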




Read also

In federated learning problems, data is scattered across different servers and exchanging or pooling it is often impractical or prohibited. We develop a Bayesian nonparametric framework for federated learning with neural networks. Each data server is assumed to provide local neural network weights, which are modeled through our framework. We then develop an inference approach that allows us to synthesize a more expressive global network without additional supervision, data pooling and with as few as a single communication round. We then demonstrate the efficacy of our approach on federated learning problems simulated from two popular image classification datasets.
We develop variational Laplace for Bayesian neural networks (BNNs) which exploits a local approximation of the curvature of the likelihood to estimate the ELBO without the need for stochastic sampling of the neural-network weights. The Variational Laplace objective is simple to evaluate, as it is (in essence) the log-likelihood, plus weight-decay, plus a squared-gradient regularizer. Variational Laplace gave better test performance and expected calibration errors than maximum a-posteriori inference and standard sampling-based variational inference, despite using the same variational approximate posterior. Finally, we emphasise care needed in benchmarking standard VI as there is a risk of stopping before the variance parameters have converged. We show that early-stopping can be avoided by increasing the learning rate for the variance parameters.
Bayesian nonparametric hierarchical priors are highly effective in providing flexible models for latent data structures exhibiting sharing of information between and across groups. Most prominent is the Hierarchical Dirichlet Process (HDP), and its subsequent variants, which model latent clustering between and across groups. The HDP may be viewed as a more flexible extension of Latent Dirichlet Allocation models (LDA), and has been applied to, for example, topic modelling, natural language processing, and datasets arising in health-care. We focus on analogous latent feature allocation models, where the data structures correspond to multisets or unbounded sparse matrices. The fundamental development in this regard is the Hierarchical Indian Buffet process (HIBP), which utilizes a hierarchy of Beta processes over J groups, where each group generates binary random matrices, reflecting within group sharing of features, according to beta-Bernoulli IBP priors. To encompass HI
How users in a dynamic system learn and make decisions has become increasingly important in numerous research fields. Although some work in the social learning literature addresses how to construct a belief about an uncertain system state, few studies have incorporated social learning into decision making. Moreover, users may make multiple concurrent decisions on different objects/resources, and their decisions usually negatively influence each other's utility, which makes the problem even more challenging. In this paper, we propose an Indian Buffet Game to study how users in a dynamic system learn the uncertain system state and make multiple concurrent decisions by considering not only the current myopic utility but also the influence of subsequent users' decisions. We analyze the proposed Indian Buffet Game under two different scenarios: customers request multiple dishes without a budget constraint and with a budget constraint. For both cases, we design recursive best response algorithms to find the subgame perfect Nash equilibrium for customers and characterize special properties of the Nash equilibrium profile under the homogeneous setting. Moreover, we introduce a non-Bayesian social learning algorithm for customers to learn the system state, and theoretically prove its convergence. Finally, we conduct simulations to validate the effectiveness and efficiency of the proposed algorithms.
Bayesian Neural Networks (BNNs) have recently received increasing attention for their ability to provide well-calibrated posterior uncertainties. However, model selection---even choosing the number of nodes---remains an open question. Recent work has proposed the use of a horseshoe prior over node pre-activations of a Bayesian neural network, which effectively turns off nodes that do not help explain the data. In this work, we propose several modeling and inference advances that consistently improve the compactness of the model learned while maintaining predictive performance, especially in smaller-sample settings including reinforcement learning.
