ترغب بنشر مسار تعليمي؟ اضغط هنا

Multi-Emitter MAP-Elites: Improving quality, diversity and convergence speed with heterogeneous sets of emitters

118   0   0.0 ( 0 )
 نشر من قبل Antoine Cully
 تاريخ النشر 2020
  مجال البحث الهندسة المعلوماتية
والبحث باللغة English
 تأليف Antoine Cully




اسأل ChatGPT حول البحث

Quality-Diversity (QD) optimisation is a new family of learning algorithms that aims at generating collections of diverse and high-performing solutions. Among those algorithms, the recently introduced Covariance Matrix Adaptation MAP-Elites (CMA-ME) algorithm proposes the concept of emitters, which uses a predefined heuristic to drive the algorithms exploration. This algorithm was shown to outperform MAP-Elites, a popular QD algorithm that has demonstrated promising results in numerous applications. In this paper, we introduce Multi-Emitter MAP-Elites (ME-MAP-Elites), an algorithm that directly extends CMA-ME and improves its quality, diversity and data efficiency. It leverages the diversity of a heterogeneous set of emitters, in which each emitter type improves the optimisation process in different ways. A bandit algorithm dynamically finds the best selection of emitters depending on the current situation. We evaluate the performance of ME-MAP-Elites on six tasks, ranging from standard optimisation problems (in 100 dimensions) to complex locomotion tasks in robotics. Our comparisons against CMA-ME and MAP-Elites show that ME-MAP-Elites is faster at providing collections of solutions that are significantly more diverse and higher performing. Moreover, in cases where no fruitful synergy can be found between the different emitters, ME-MAP-Elites is equivalent to the best of the compared algorithms.

قيم البحث

اقرأ أيضاً

Quality Diversity (QD) algorithms are a recent family of optimization algorithms that search for a large set of diverse but high-performing solutions. In some specific situations, they can solve multiple tasks at once. For instance, they can find the joint positions required for a robotic arm to reach a set of points, which can also be solved by running a classic optimizer for each target point. However, they cannot solve multiple tasks when the fitness needs to be evaluated independently for each task (e.g., optimizing policies to grasp many different objects). In this paper, we propose an extension of the MAP-Elites algorithm, called Multi-task MAP-Elites, that solves multiple tasks when the fitness function depends on the task. We evaluate it on a simulated parameterized planar arm (10-dimensional search space; 5000 tasks) and on a simulated 6-legged robot with legs of different lengths (36-dimensional search space; 2000 tasks). The results show that in both cases our algorithm outperforms the optimization of each task separately with the CMA-ES algorithm.
Quality-Diversity (QD) algorithms, and MAP-Elites (ME) in particular, have proven very useful for a broad range of applications including enabling real robots to recover quickly from joint damage, solving strongly deceptive maze tasks or evolving rob ot morphologies to discover new gaits. However, present implementations of MAP-Elites and other QD algorithms seem to be limited to low-dimensional controllers with far fewer parameters than modern deep neural network models. In this paper, we propose to leverage the efficiency of Evolution Strategies (ES) to scale MAP-Elites to high-dimensional controllers parameterized by large neural networks. We design and evaluate a new hybrid algorithm called MAP-Elites with Evolution Strategies (ME-ES) for post-damage recovery in a difficult high-dimensional control task where traditional ME fails. Additionally, we show that ME-ES performs efficient exploration, on par with state-of-the-art exploration algorithms in high-dimensional control tasks with strongly deceptive rewards.
Quality-Diversity optimisation algorithms enable the evolution of collections of both high-performing and diverse solutions. These collections offer the possibility to quickly adapt and switch from one solution to another in case it is not working as expected. It therefore finds many applications in real-world domain problems such as robotic control. However, QD algorithms, like most optimisation algorithms, are very sensitive to uncertainty on the fitness function, but also on the behavioural descriptors. Yet, such uncertainties are frequent in real-world applications. Few works have explored this issue in the specific case of QD algorithms, and inspired by the literature in Evolutionary Computation, mainly focus on using sampling to approximate the true value of the performances of a solution. However, sampling approaches require a high number of evaluations, which in many applications such as robotics, can quickly become impractical. In this work, we propose Deep-Grid MAP-Elites, a variant of the MAP-Elites algorithm that uses an archive of similar previously encountered solutions to approximate the performance of a solution. We compare our approach to previously explored ones on three noisy tasks: a standard optimisation task, the control of a redundant arm and a simulated Hexapod robot. The experimental results show that this simple approach is significantly more resilient to noise on the behavioural descriptors, while achieving competitive performances in terms of fitness optimisation, and being more sample-efficient than other existing approaches.
Quality-Diversity algorithms refer to a class of evolutionary algorithms designed to find a collection of diverse and high-performing solutions to a given problem. In robotics, such algorithms can be used for generating a collection of controllers co vering most of the possible behaviours of a robot. To do so, these algorithms associate a behavioural descriptor to each of these behaviours. Each behavioural descriptor is used for estimating the novelty of one behaviour compared to the others. In most existing algorithms, the behavioural descriptor needs to be hand-coded, thus requiring prior knowledge about the task to solve. In this paper, we introduce: Autonomous Robots Realising their Abilities, an algorithm that uses a dimensionality reduction technique to automatically learn behavioural descriptors based on raw sensory data. The performance of this algorithm is assessed on three robotic tasks in simulation. The experimental results show that it performs similarly to traditional hand-coded approaches without the requirement to provide any hand-coded behavioural descriptor. In the collection of diverse and high-performing solutions, it also manages to find behaviours that are novel with respect to more features than its hand-coded baselines. Finally, we introduce a variant of the algorithm which is robust to the dimensionality of the behavioural descriptor space.
In modular robotics, modules can be reconfigured to change the morphology of the robot, making it able to adapt for specific tasks. However, optimizing both the body and control is a difficult challenge due to the intricate relationship between fine- tuning control and morphological changes that can invalidate such optimizations. To solve this challenge we compare three different Evolutionary Algorithms on their capacity to optimize morphologies in modular robotics. We compare two objective-based search algorithms, with MAP-Elites. To understand the benefit of diversity we transition the evolved populations into two difficult environments to see if diversity can have an impact on solving complex environments. In addition, we analyse the genealogical ancestry to shed light on the notion of stepping stones as key to enable high performance. The results show that MAP-Elites is capable of evolving the highest performing solutions in addition to generating the largest morphological diversity. For the transition between environments the results show that MAP-Elites is better at regaining performance by promoting morphological diversity. With the analysis of genealogical ancestry we show that MAP-Elites produces more diverse and higher performing stepping stones than the other objective-based search algorithms. Transitioning the populations to more difficult environments show the utility of morphological diversity, while the analysis of stepping stones show a strong correlation between diversity of ancestry and maximum performance on the locomotion task. The paper shows the advantage of promoting diversity for solving a locomotion task in different environments for modular robotics. By showing that the quality and diversity of stepping stones in Evolutionary Algorithms is an important factor for overall performance we have opened up a new area of analysis and results.
التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا