No Arabic abstract
A combined Short-Term Learning (STL) and Long-Term Learning (LTL) approach to solving mobile robot navigation problems is presented and tested in both real and simulated environments. The LTL consists of rapid simulations that use a Genetic Algorithm to derive diverse sets of behaviours. These sets are then transferred to an idiotypic Artificial Immune System (AIS), which forms the STL phase, and the system is said to be seeded. The combined LTL-STL approach is compared with using STL only, and with using a handdesigned controller. In addition, the STL phase is tested when the idiotypic mechanism is turned off. The results provide substantial evidence that the best option is the seeded idiotypic system, i.e. the architecture that merges LTL with an idiotypic AIS for the STL. They also show that structurally different environments can be used for the two phases without compromising transferability
The challenge of robotic reproduction -- making of new robots by recombining two existing ones -- has been recently cracked and physically evolving robot systems have come within reach. Here we address the next big hurdle: producing an adequate brain for a newborn robot. In particular, we address the task of targeted locomotion which is arguably a fundamental skill in any practical implementation. We introduce a controller architecture and a generic learning method to allow a modular robot with an arbitrary shape to learn to walk towards a target and follow this target if it moves. Our approach is validated on three robots, a spider, a gecko, and their offspring, in three real-world scenarios.
Evolution has resulted in highly developed abilities in many natural intelligences to quickly and accurately predict mechanical phenomena. Humans have successfully developed laws of physics to abstract and model such mechanical phenomena. In the context of artificial intelligence, a recent line of work has focused on estimating physical parameters based on sensory data and use them in physical simulators to make long-term predictions. In contrast, we investigate the effectiveness of a single neural network for end-to-end long-term prediction of mechanical phenomena. Based on extensive evaluation, we demonstrate that such networks can outperform alternate approaches having even access to ground-truth physical simulators, especially when some physical parameters are unobserved or not known a-priori. Further, our network outputs a distribution of outcomes to capture the inherent uncertainty in the data. Our approach demonstrates for the first time the possibility of making actionable long-term predictions from sensor data without requiring to explicitly model the underlying physical laws.
We introduce ES-ENAS, a simple yet general evolutionary joint optimization procedure by combining continuous optimization via Evolutionary Strategies (ES) and combinatorial optimization via Efficient NAS (ENAS) in a highly scalable and intuitive way. Our main insight is noticing that ES is already a highly distributed algorithm involving hundreds of forward passes which can not only be used for training neural network weights, but also for jointly training a NAS controller, both in a blackbox fashion. By doing so, we also bridge the gap from NAS research in supervised learning settings to the reinforcement learning scenario through this relatively simple marriage between two different yet common lines of research. We demonstrate the utility and effectiveness of our method over a large search space by training highly combinatorial neural network architectures for RL problems in continuous control, via edge pruning and quantization. We also incorporate a wide variety of popular techniques from modern NAS literature including multiobjective optimization along with various controller methods, to showcase their promise in the RL field and discuss possible extensions.
We consider the setting of an agent with a fixed body interacting with an unknown and uncertain external world. We show that models trained to predict proprioceptive information about the agents body come to represent objects in the external world. In spite of being trained with only internally available signals, these dynamic body models come to represent external objects through the necessity of predicting their effects on the agents own body. That is, the model learns holistic persistent representations of objects in the world, even though the only training signals are body signals. Our dynamics model is able to successfully predict distributions over 132 sensor readings over 100 steps into the future and we demonstrate that even when the body is no longer in contact with an object, the latent variables of the dynamics model continue to represent its shape. We show that active data collection by maximizing the entropy of predictions about the body---touch sensors, proprioception and vestibular information---leads to learning of dynamic models that show superior performance when used for control. We also collect data from a real robotic hand and show that the same models can be used to answer questions about properties of objects in the real world. Videos with qualitative results of our models are available at https://goo.gl/mZuqAV.
The immune system is a complex biological system with a highly distributed, adaptive and self-organising nature. This paper presents an Artificial Immune System (AIS) that exploits some of these characteristics and is applied to the task of film recommendation by Collaborative Filtering (CF). Natural evolution and in particular the immune system have not been designed for classical optimisation. However, for this problem, we are not interested in finding a single optimum. Rather we intend to identify a sub-set of good matches on which recommendations can be based. It is our hypothesis that an AIS built on two central aspects of the biological immune system will be an ideal candidate to achieve this: Antigen-antibody interaction for matching and idiotypic antibody-antibody interaction for diversity. Computational results are presented in support of this conjecture and compared to those found by other CF techniques.