Reinforcement Learning with Chromatic Networks for Compact Architecture Search

116 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Xingyou Song

تاريخ النشر 2019

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Xingyou Song - Krzysztof Choromanski - Jack Parker-Holder

قم بزيارة صفحتنا على فيسبوك

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

We present a neural architecture search algorithm to construct compact reinforcement learning (RL) policies, by combining ENAS and ES in a highly scalable and intuitive way. By defining the combinatorial search space of NAS to be the set of different edge-partitionings (colorings) into same-weight classes, we represent compact architectures via efficient learned edge-partitionings. For several RL tasks, we manage to learn colorings translating to effective policies parameterized by as few as $17$ weight parameters, providing >90% compression over vanilla policies and 6x compression over state-of-the-art compact policies based on Toeplitz matrices, while still maintaining good reward. We believe that our work is one of the first attempts to propose a rigorous approach to training structured neural network architectures for RL problems that are of interest especially in mobile robotics with limited storage and computational resources.

قيم البحث

103 - En-Jui Kuo , Yao-Lung L. Fang , Samuel Yen-Chi Chen 2021

Recent advances in quantum computing have drawn considerable attention to building realistic application for and using quantum computers. However, designing a suitable quantum circuit architecture requires expert knowledge. For example, it is non-tri vial to design a quantum gate sequence for generating a particular quantum state with as fewer gates as possible. We propose a quantum architecture search framework with the power of deep reinforcement learning (DRL) to address this challenge. In the proposed framework, the DRL agent can only access the Pauli-$X$, $Y$, $Z$ expectation values and a predefined set of quantum operations for learning the target quantum state, and is optimized by the advantage actor-critic (A2C) and proximal policy optimization (PPO) algorithms. We demonstrate a successful generation of quantum gate sequences for multi-qubit GHZ states without encoding any knowledge of quantum physics in the agent. The design of our framework is rather general and can be employed with other DRL architectures or optimization methods to study gate synthesis and compilation for many quantum states.

فيزياء الكم الذكاء الاصطناعي التعلم الآلي

Adaptive Reinforcement Learning through Evolving Self-Modifying Neural Networks

189 - Samuel Schmidgall 2020

The adaptive learning capabilities seen in biological neural networks are largely a product of the self-modifying behavior emerging from online plastic changes in synaptic connectivity. Current methods in Reinforcement Learning (RL) only adjust to ne w interactions after reflection over a specified time interval, preventing the emergence of online adaptivity. Recent work addressing this by endowing artificial neural networks with neuromodulated plasticity have been shown to improve performance on simple RL tasks trained using backpropagation, but have yet to scale up to larger problems. Here we study the problem of meta-learning in a challenging quadruped domain, where each leg of the quadruped has a chance of becoming unusable, requiring the agent to adapt by continuing locomotion with the remaining limbs. Results demonstrate that agents evolved using self-modifying plastic networks are more capable of adapting to complex meta-learning learning tasks, even outperforming the same network updated using gradient-based algorithms while taking less time to train.

الحوسبة العصبية والتطورية الذكاء الاصطناعي التعلم الآلي

RL-DARTS: Differentiable Architecture Search for Reinforcement Learning

336 - Yingjie Miao , Xingyou Song , Daiyi Peng 2021

We introduce RL-DARTS, one of the first applications of Differentiable Architecture Search (DARTS) in reinforcement learning (RL) to search for convolutional cells, applied to the Procgen benchmark. We outline the initial difficulties of applying neu ral architecture search techniques in RL, and demonstrate that by simply replacing the image encoder with a DARTS supernet, our search method is sample-efficient, requires minimal extra compute resources, and is also compatible with off-policy and on-policy RL algorithms, needing only minor changes in preexisting code. Surprisingly, we find that the supernet can be used as an actor for inference to generate replay data in standard RL training loops, and thus train end-to-end. Throughout this training process, we show that the supernet gradually learns better cells, leading to alternative architectures which can be highly competitive against manually designed policies, but also verify previous design choices for RL policies.

التعلم الآلي الذكاء الاصطناعي الرؤية الحاسوبية وتمييز الأنماط

Lineage Evolution Reinforcement Learning

132 - Zeyu Zhang , Guisheng Yin 2020

We propose a general agent population learning system, and on this basis, we propose lineage evolution reinforcement learning algorithm. Lineage evolution reinforcement learning is a kind of derivative algorithm which accords with the general agent p opulation learning system. We take the agents in DQN and its related variants as the basic agents in the population, and add the selection, mutation and crossover modules in the genetic algorithm to the reinforcement learning algorithm. In the process of agent evolution, we refer to the characteristics of natural genetic behavior, add lineage factor to ensure the retention of potential performance of agent, and comprehensively consider the current performance and lineage value when evaluating the performance of agent. Without changing the parameters of the original reinforcement learning algorithm, lineage evolution reinforcement learning can optimize different reinforcement learning algorithms. Our experiments show that the idea of evolution with lineage improves the performance of original reinforcement learning algorithm in some games in Atari 2600.

الحوسبة العصبية والتطورية الذكاء الاصطناعي التعلم الآلي

Evolutionary Architecture Search for Graph Neural Networks

113 - Min Shi , David A.Wilson , Xingquan Zhu 2020

Automated machine learning (AutoML) has seen a resurgence in interest with the boom of deep learning over the past decade. In particular, Neural Architecture Search (NAS) has seen significant attention throughout the AutoML research community, and ha s pushed forward the state-of-the-art in a number of neural models to address grid-like data such as texts and images. However, very litter work has been done about Graph Neural Networks (GNN) learning on unstructured network data. Given the huge number of choices and combinations of components such as aggregator and activation function, determining the suitable GNN structure for a specific problem normally necessitates tremendous expert knowledge and laborious trails. In addition, the slight variation of hyper parameters such as learning rate and dropout rate could dramatically hurt the learning capacity of GNN. In this paper, we propose a novel AutoML framework through the evolution of individual models in a large GNN architecture space involving both neural structures and learning parameters. Instead of optimizing only the model structures with fixed parameter settings as existing work, an alternating evolution process is performed between GNN structures and learning parameters to dynamically find the best fit of each other. To the best of our knowledge, this is the first work to introduce and evaluate evolutionary architecture search for GNN models. Experiments and validations demonstrate that evolutionary NAS is capable of matching existing state-of-the-art reinforcement learning approaches for both the semi-supervised transductive and inductive node representation learning and classification.

الحوسبة العصبية والتطورية