A Perspective on Objects and Systematic Generalization in Model-Based RL

180 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Sjoerd van Steenkiste

تاريخ النشر 2019

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Sjoerd van Steenkiste - Klaus Greff - Jurgen Schmidhuber

التعلم الآلي الذكاء الاصطناعي الحوسبة العصبية والتطورية

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

In order to meet the diverse challenges in solving many real-world problems, an intelligent agent has to be able to dynamically construct a model of its environment. Objects facilitate the modular reuse of prior knowledge and the combinatorial construction of such models. In this work, we argue that dynamically bound features (objects) do not simply emerge in connectionist models of the world. We identify several requirements that need to be fulfilled in overcoming this limitation and highlight corresponding inductive biases.

قيم البحث

65 - Robert Csordas , Kazuki Irie , Jurgen Schmidhuber 2021

Recently, many datasets have been proposed to test the systematic generalization ability of neural networks. The companion baseline Transformers, typically trained with default hyper-parameters from standard tasks, are shown to fail dramatically. Her e we demonstrate that by revisiting model configurations as basic as scaling of embeddings, early stopping, relative positional embedding, and Universal Transformer variants, we can drastically improve the performance of Transformers on systematic generalization. We report improvements on five popular datasets: SCAN, CFQ, PCFG, COGS, and Mathematics dataset. Our models improve accuracy from 50% to 85% on the PCFG productivity split, and from 35% to 81% on COGS. On SCAN, relative positional embedding largely mitigates the EOS decision problem (Newman et al., 2020), yielding 100% accuracy on the length split with a cutoff at 26. Importantly, performance differences between these models are typically invisible on the IID data split. This calls for proper generalization validation sets for developing neural networks that generalize systematically. We publicly release the code to reproduce our results.

التعلم الآلي الذكاء الاصطناعي الحوسبة العصبية والتطورية

Bilinear Classes: A Structural Framework for Provable Generalization in RL

133 - Simon S. Du , Sham M. Kakade , Jason D. Lee 2021

This work introduces Bilinear Classes, a new structural framework, which permit generalization in reinforcement learning in a wide variety of settings through the use of function approximation. The framework incorporates nearly all existing models in which a polynomial sample complexity is achievable, and, notably, also includes new models, such as the Linear $Q^*/V^*$ model in which both the optimal $Q$-function and the optimal $V$-function are linear in some known feature space. Our main result provides an RL algorithm which has polynomial sample complexity for Bilinear Classes; notably, this sample complexity is stated in terms of a reduction to the generalization error of an underlying supervised learning sub-problem. These bounds nearly match the best known sample complexity bounds for existing models. Furthermore, this framework also extends to the infinite dimensional (RKHS) setting: for the the Linear $Q^*/V^*$ model, linear MDPs, and linear mixture MDPs, we provide sample complexities that have no explicit dependence on the explicit feature dimension (which could be infinite), but instead depends only on information theoretic quantities.

التعلم الآلي الذكاء الاصطناعي التحسين والتحكم

A HINT from Arithmetic: On Systematic Generalization of Perception, Syntax, and Semantics

102 - Qing Li , Siyuan Huang , Yining Hong 2021

Inspired by humans remarkable ability to master arithmetic and generalize to unseen problems, we present a new dataset, HINT, to study machines capability of learning generalizable concepts at three different levels: perception, syntax, and semantics . In particular, concepts in HINT, including both digits and operators, are required to learn in a weakly-supervised fashion: Only the final results of handwriting expressions are provided as supervision. Learning agents need to reckon how concepts are perceived from raw signals such as images (i.e., perception), how multiple concepts are structurally combined to form a valid expression (i.e., syntax), and how concepts are realized to afford various reasoning tasks (i.e., semantics). With a focus on systematic generalization, we carefully design a five-fold test set to evaluate both the interpolation and the extrapolation of learned concepts. To tackle this challenging problem, we propose a neural-symbolic system by integrating neural networks with grammar parsing and program synthesis, learned by a novel deduction--abduction strategy. In experiments, the proposed neural-symbolic system demonstrates strong generalization capability and significantly outperforms end-to-end neural methods like RNN and Transformer. The results also indicate the significance of recursive priors for extrapolation on syntax and semantics.

التعلم الآلي الذكاء الاصطناعي الرؤية الحاسوبية وتمييز الأنماط

Why Generalization in RL is Difficult: Epistemic POMDPs and Implicit Partial Observability

141 - Dibya Ghosh , Jad Rahme , Aviral Kumar 2021

Generalization is a central challenge for the deployment of reinforcement learning (RL) systems in the real world. In this paper, we show that the sequential structure of the RL problem necessitates new approaches to generalization beyond the well-st udied techniques used in supervised learning. While supervised learning methods can generalize effectively without explicitly accounting for epistemic uncertainty, we show that, perhaps surprisingly, this is not the case in RL. We show that generalization to unseen test conditions from a limited number of training conditions induces implicit partial observability, effectively turning even fully-observed MDPs into POMDPs. Informed by this observation, we recast the problem of generalization in RL as solving the induced partially observed Markov decision process, which we call the epistemic POMDP. We demonstrate the failure modes of algorithms that do not appropriately handle this partial observability, and suggest a simple ensemble-based technique for approximately solving the partially observed problem. Empirically, we demonstrate that our simple algorithm derived from the epistemic POMDP achieves significant gains in generalization over current methods on the Procgen benchmark suite.

التعلم الآلي الذكاء الاصطناعي التعلم الالي

What underlies rapid learning and systematic generalization in humans

86 - Andrew Joohun Nam , James L. McClelland 2021

Despite the groundbreaking successes of neural networks, contemporary models require extensive training with massive datasets and exhibit poor out-of-sample generalization. One proposed solution is to build systematicity and domain-specific constrain ts into the model, echoing the tenets of classical, symbolic cognitive architectures. In this paper, we consider the limitations of this approach by examining human adults ability to learn an abstract reasoning task from a brief instructional tutorial and explanatory feedback for incorrect responses, demonstrating that human learning dynamics and ability to generalize outside the range of the training examples differ drastically from those of a representative neural network model, and that the model is brittle to changes in features not anticipated by its authors. We present further evidence from human data that the ability to consistently solve the puzzles was associated with education, particularly basic mathematics education, and with the ability to provide a reliably identifiable, valid description of the strategy used. We propose that rapid learning and systematic generalization in humans may depend on a gradual, experience-dependent process of learning-to-learn using instructions and explanations to guide the construction of explicit abstract rules that support generalizable inferences.

التعلم الآلي الذكاء الاصطناعي الحساب الرمزي