
Action Assembly: Sparse Imitation Learning for Text Based Games with Combinatorial Action Spaces

Added by Chen Tessler
Publication date: 2019
Language: English





We propose a computationally efficient algorithm that combines compressed sensing with imitation learning to solve text-based games with combinatorial action spaces. Specifically, we introduce a new compressed sensing algorithm, named IK-OMP, which can be seen as an extension of Orthogonal Matching Pursuit (OMP). We incorporate IK-OMP into a supervised imitation learning setting and show that the combined approach (Sparse Imitation Learning, Sparse-IL) solves the entire text-based game of Zork1, with an action space of approximately 10 million actions, given both perfect and noisy demonstrations.
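To give intuition for the sparse-recovery step, here is a minimal sketch of plain Orthogonal Matching Pursuit, the algorithm IK-OMP builds on: the imitation-learned policy outputs a dense embedding y, and OMP greedily recovers a k-sparse bag-of-words vector x (the words of the command) such that D @ x approximates y. The dictionary, dimensions, and recovery loop below are illustrative assumptions; the paper's IK-OMP refinements are not shown.

import numpy as np

def omp(D, y, k):
    """Greedily recover a k-sparse x such that D @ x ~= y."""
    residual = y.copy()
    support = []
    x = np.zeros(D.shape[1])
    for _ in range(k):
        # Pick the dictionary atom most correlated with the residual.
        j = int(np.argmax(np.abs(D.T @ residual)))
        if j not in support:
            support.append(j)
        # Least-squares refit of the coefficients on the current support.
        coeffs, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        x[:] = 0.0
        x[support] = coeffs
        residual = y - D @ x
    return x

# Toy usage: a 3-word command over a 1000-word vocabulary, 128-dim embedding.
rng = np.random.default_rng(0)
D = rng.standard_normal((128, 1000))        # one column per vocabulary word
true_x = np.zeros(1000)
true_x[[5, 42, 900]] = 1.0                  # the "words" of the command
x_hat = omp(D, D @ true_x, k=3)
print(np.nonzero(x_hat)[0])                 # expected: [  5  42 900]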




Related research

In complex tasks, such as those with large combinatorial action spaces, random exploration may be too inefficient to achieve meaningful learning progress. In this work, we use a curriculum of progressively growing action spaces to accelerate learning. We assume the environment is out of our control, but that the agent may set an internal curriculum by initially restricting its action space. Our approach uses off-policy reinforcement learning to estimate optimal value functions for multiple action spaces simultaneously and efficiently transfers data, value estimates, and state representations from restricted action spaces to the full task. We show the efficacy of our approach in proof-of-concept control tasks and on challenging large-scale StarCraft micromanagement tasks with large, multi-agent action spaces.
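As a rough illustration of this agent-side curriculum (not the authors' code), the sketch below masks all actions beyond a growing cutoff and widens the cutoff on a fixed schedule; the paper's transfer of data, value estimates, and state representations between action spaces is omitted, and all names and schedule values are assumptions.

import numpy as np

class GrowingActionSpace:
    """Agent-side curriculum: only the first `active` actions are selectable."""

    def __init__(self, n_actions, stages, steps_per_stage):
        self.n_actions = n_actions
        self.stages = stages                  # e.g. [4, 16, 64, n_actions]
        self.steps_per_stage = steps_per_stage
        self.step_count = 0

    @property
    def active(self):
        stage = min(self.step_count // self.steps_per_stage, len(self.stages) - 1)
        return self.stages[stage]

    def select(self, q_values):
        """Greedy action restricted to the currently unlocked subset."""
        self.step_count += 1
        masked = np.where(np.arange(self.n_actions) < self.active, q_values, -np.inf)
        return int(np.argmax(masked))

# Toy usage: 256 actions, unlocked in four stages.
curric = GrowingActionSpace(256, stages=[4, 16, 64, 256], steps_per_stage=10_000)
a = curric.select(np.random.default_rng(0).standard_normal(256))  # a < 4 at first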
Transfer learning (TL) is a promising way to improve the sample efficiency of reinforcement learning. However, efficiently transferring knowledge across tasks with different state-action spaces remains under-explored. Most previous studies only addressed the inconsistency across state spaces by learning a common feature space, without considering that similar actions in the action spaces of related tasks share similar semantics. In this paper, we propose a method for learning action embeddings that leverages this idea, and a framework that learns both state embeddings and action embeddings to transfer a policy across tasks with different state and action spaces. Our experimental results on various tasks show that the proposed method can not only learn informative action embeddings but also accelerate policy learning.
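A hedged sketch of the reuse step this enables: once source and target actions live in a shared embedding space, each target action can be mapped to its most similar source action by cosine similarity. The embeddings below are random placeholders; the paper learns them jointly with state embeddings.

import numpy as np

def nearest_source_action(target_emb, source_embs):
    """Index of the source action whose embedding is most cosine-similar."""
    sims = source_embs @ target_emb / (
        np.linalg.norm(source_embs, axis=1) * np.linalg.norm(target_emb) + 1e-8
    )
    return int(np.argmax(sims))

# Toy usage: 6 source actions and one target action in a shared 8-dim space.
rng = np.random.default_rng(1)
source_embs = rng.standard_normal((6, 8))
target_emb = source_embs[3] + 0.1 * rng.standard_normal(8)   # close to action 3
print(nearest_source_action(target_emb, source_embs))        # expected: 3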
Many important real-world problems have action spaces that are high-dimensional, continuous or both, making full enumeration of all possible actions infeasible. Instead, only small subsets of actions can be sampled for the purpose of policy evaluation and improvement. In this paper, we propose a general framework to reason in a principled way about policy evaluation and improvement over such sampled action subsets. This sample-based policy iteration framework can in principle be applied to any reinforcement learning algorithm based upon policy iteration. Concretely, we propose Sampled MuZero, an extension of the MuZero algorithm that is able to learn in domains with arbitrarily complex action spaces by planning over sampled actions. We demonstrate this approach on the classical board game of Go and on two continuous control benchmark domains: DeepMind Control Suite and Real-World RL Suite.
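The core idea can be illustrated in a few lines of hypothetical Python: draw K actions from the current policy, evaluate them, and improve within that sampled subset only. The hooks policy_sample and q_value are assumptions standing in for the learned networks; Sampled MuZero additionally plans over the sampled actions with tree search, which is not shown.

import numpy as np

def sampled_improvement(policy_sample, q_value, state, k=20, temperature=1.0):
    """Act (softly) greedily within a subset of K actions sampled from the policy."""
    actions = [policy_sample(state) for _ in range(k)]        # K samples ~ pi(.|s)
    q = np.array([q_value(state, a) for a in actions])
    probs = np.exp((q - q.max()) / temperature)               # softmax over the subset
    probs /= probs.sum()
    return actions[np.random.choice(k, p=probs)]

# Toy usage with a random 1-D continuous action space.
rng = np.random.default_rng(2)
act = sampled_improvement(lambda s: rng.uniform(-1, 1),
                          lambda s, a: -(a - 0.5) ** 2, state=None)
print(act)  # concentrates near a = 0.5 as temperature decreases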
Multi-agent control problems constitute an interesting area of application for deep reinforcement learning models with continuous action spaces. Such real-world applications, however, typically come with critical safety constraints that must not be violated. In order to ensure safety, we enhance the well-known multi-agent deep deterministic policy gradient (MADDPG) framework by adding a safety layer to the deep policy network. In particular, we extend the idea of linearizing the single-step transition dynamics, as was done for single-agent systems in Safe DDPG (Dalal et al., 2018), to multi-agent settings. We additionally propose to circumvent infeasibility problems in the action correction step using soft constraints (Kerrigan & Maciejowski, 2000). Results from the theory of exact penalty functions can be used to guarantee constraint satisfaction of the soft constraints under mild assumptions. We empirically find that the soft formulation achieves a dramatic decrease in constraint violations, maintaining safety even during the learning procedure.
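A minimal sketch of the soft action-correction step, under the linearized constraint model G @ a + g <= 0: the policy's action a_pi is corrected by minimizing ||a - a_pi||^2 + C * sum(relu(G @ a + g)), where exact-penalty theory guarantees feasibility of the soft solution for large enough C. A few subgradient steps stand in here for a proper penalty/QP solver; the matrices and constants are illustrative, not the paper's.

import numpy as np

def soft_safety_layer(a_pi, G, g, C=10.0, lr=1e-3, iters=1000):
    """Correct a_pi toward the linearized safe set {a : G @ a + g <= 0}."""
    a = a_pi.copy()
    for _ in range(iters):
        active = (G @ a + g > 0.0).astype(float)       # currently violated rows
        grad = 2.0 * (a - a_pi) + C * (G.T @ active)   # subgradient of the soft objective
        a -= lr * grad
    return a

# Toy usage: one constraint a[0] <= 0.2 on a 2-D action.
a_pi = np.array([1.0, 0.5])
G = np.array([[1.0, 0.0]])
g = np.array([-0.2])
print(soft_safety_layer(a_pi, G, g))  # a[0] pulled down to ~0.2, a[1] unchanged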
Yuhang Gai, Jiuming Guo, Dan Wu (2021)
Reinforcement learning (RL) is often the preferred approach for constructing control strategies for complex tasks such as asymmetric assembly. However, the slow convergence of reinforcement learning severely restricts its practical application. In this paper, convergence is first accelerated by combining RL with compliance control. Then a progressive extension of action dimension (PEAD) mechanism is proposed to improve the convergence of RL algorithms. The PEAD method is verified with DDPG and PPO. The results demonstrate that PEAD enhances the data-efficiency and time-efficiency of RL algorithms and increases the stable reward, offering more potential for practical applications of RL.
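A hedged sketch of the PEAD idea as described in the abstract: begin with only the first action dimension free, pin the remaining dimensions to a compliant default, and unlock one more dimension at fixed intervals. The schedule, default value, and function name below are assumptions for illustration.

import numpy as np

def pead_action(policy_action, step, unlock_every=10_000, default=0.0):
    """Free one more action dimension every `unlock_every` steps."""
    active_dims = min(1 + step // unlock_every, policy_action.shape[0])
    a = np.full_like(policy_action, default)     # pinned dimensions stay at the default
    a[:active_dims] = policy_action[:active_dims]
    return a

# Toy usage: a 6-D action early in training has only its first dimension free.
print(pead_action(np.ones(6), step=5_000))   # -> [1. 0. 0. 0. 0. 0.]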
