Differentiable Greedy Networks

91 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Rasool Fakoor

تاريخ النشر 2018

مجال البحث الهندسة المعلوماتية الاحصاء الرياضي

والبحث باللغة English

تأليف Thomas Powers - Rasool Fakoor - Siamak Shakeri

التعلم الآلي التعلم الالي

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

Optimal selection of a subset of items from a given set is a hard problem that requires combinatorial optimization. In this paper, we propose a subset selection algorithm that is trainable with gradient-based methods yet achieves near-optimal performance via submodular optimization. We focus on the task of identifying a relevant set of sentences for claim verification in the context of the FEVER task. Conventional methods for this task look at sentences on their individual merit and thus do not optimize the informativeness of sentences as a set. We show that our proposed method which builds on the idea of unfolding a greedy algorithm into a computational graph allows both interpretability and gradient-based training. The proposed differentiable greedy network (DGN) outperforms discrete optimization algorithms as well as other baseline methods in terms of precision and recall.

قيم البحث

360 - Sungyong Seo , Yan Liu 2019

While physics conveys knowledge of nature built from an interplay between observations and theory, it has been considered less importantly in deep neural networks. Especially, there are few works leveraging physics behaviors when the knowledge is giv en less explicitly. In this work, we propose a novel architecture called Differentiable Physics-informed Graph Networks (DPGN) to incorporate implicit physics knowledge which is given from domain experts by informing it in latent space. Using the concept of DPGN, we demonstrate that climate prediction tasks are significantly improved. Besides the experiment results, we validate the effectiveness of the proposed module and provide further applications of DPGN, such as inductive learning and multistep predictions.

التعلم الآلي التعلم الالي

Towards Deeper Graph Neural Networks with Differentiable Group Normalization

86 - Kaixiong Zhou , Xiao Huang , Yuening Li 2020

Graph neural networks (GNNs), which learn the representation of a node by aggregating its neighbors, have become an effective computational tool in downstream applications. Over-smoothing is one of the key issues which limit the performance of GNNs a s the number of layers increases. It is because the stacked aggregators would make node representations converge to indistinguishable vectors. Several attempts have been made to tackle the issue by bringing linked node pairs close and unlinked pairs distinct. However, they often ignore the intrinsic community structures and would result in sub-optimal performance. The representations of nodes within the same community/class need be similar to facilitate the classification, while different classes are expected to be separated in embedding space. To bridge the gap, we introduce two over-smoothing metrics and a novel technique, i.e., differentiable group normalization (DGN). It normalizes nodes within the same group independently to increase their smoothness, and separates node distributions among different groups to significantly alleviate the over-smoothing issue. Experiments on real-world datasets demonstrate that DGN makes GNN models more robust to over-smoothing and achieves better performance with deeper GNNs.

التعلم الآلي التعلم الالي

Differentiable PAC-Bayes Objectives with Partially Aggregated Neural Networks

104 - Felix Biggs , Benjamin Guedj 2020

We make three related contributions motivated by the challenge of training stochastic neural networks, particularly in a PAC-Bayesian setting: (1) we show how averaging over an ensemble of stochastic neural networks enables a new class of emph{partia lly-aggregated} estimators; (2) we show that these lead to provably lower-variance gradient estimates for non-differentiable signed-output networks; (3) we reformulate a PAC-Bayesian bound for these networks to derive a directly optimisable, differentiable objective and a generalisation guarantee, without using a surrogate loss or loosening the bound. This bound is twice as tight as that of Letarte et al. (2019) on a similar network type. We show empirically that these innovations make training easier and lead to competitive guarantees.

التعلم الآلي التعلم الالي

Differentiable Bandit Exploration

362 - Craig Boutilier , Chih-Wei Hsu , Branislav Kveton 2020

Exploration policies in Bayesian bandits maximize the average reward over problem instances drawn from some distribution $mathcal{P}$. In this work, we learn such policies for an unknown distribution $mathcal{P}$ using samples from $mathcal{P}$. Our approach is a form of meta-learning and exploits properties of $mathcal{P}$ without making strong assumptions about its form. To do this, we parameterize our policies in a differentiable way and optimize them by policy gradients, an approach that is general and easy to implement. We derive effective gradient estimators and introduce novel variance reduction techniques. We also analyze and experiment with various bandit policy classes, including neural networks and a novel softmax policy. The latter has regret guarantees and is a natural starting point for our optimization. Our experiments show the versatility of our approach. We also observe that neural network policies can learn implicit biases expressed only through the sampled instances.

التعلم الآلي التعلم الالي

DiBS: Differentiable Bayesian Structure Learning

199 - Lars Lorch , Jonas Rothfuss , Bernhard Scholkopf 2021

Bayesian structure learning allows inferring Bayesian network structure from data while reasoning about the epistemic uncertainty -- a key element towards enabling active causal discovery and designing interventions in real world systems. In this wor k, we propose a general, fully differentiable framework for Bayesian structure learning (DiBS) that operates in the continuous space of a latent probabilistic graph representation. Contrary to existing work, DiBS is agnostic to the form of the local conditional distributions and allows for joint posterior inference of both the graph structure and the conditional distribution parameters. This makes DiBS directly applicable to posterior inference of nonstandard Bayesian network models, e.g., with nonlinear dependencies encoded by neural networks. Building on recent advances in variational inference, we use DiBS to devise an efficient general purpose method for approximating posteriors over structural models. In evaluations on simulated and real-world data, our method significantly outperforms related approaches to joint posterior inference.

التعلم الآلي التعلم الالي