A Component Based Heuristic Search Method with Evolutionary Eliminations

Published by Uwe Aickelin

Publication date: 2009

Research field: Informatics Engineering

Paper language: English

Authors: Jingpeng Li, Uwe Aickelin, Edmund Burke

Artificial Intelligence, Neural and Evolutionary Computing

Abstract

Nurse rostering is a complex scheduling problem that affects hospital personnel daily all over the world. This paper presents a new component-based approach with evolutionary eliminations for a nurse scheduling problem arising at a major UK hospital. The main idea behind this technique is to decompose a schedule into its components (i.e. the allocated shift pattern of each nurse), and then to apply two evolutionary elimination strategies, mimicking natural selection and natural mutation respectively, to these components in order to iteratively deliver better schedules. Every component must continuously demonstrate its worthiness in order to remain in the schedule. This demonstration employs an evaluation function that measures how well each component contributes towards the final objective. Two elimination steps are then applied: the first removes a number of components deemed not worthy to stay in the current schedule; the second may also throw out, with low probability, some worthy components. The eliminated components are replenished with new ones by a set of constructive heuristics based on local optimality criteria. Computational results on 52 data instances demonstrate the applicability of the proposed approach to real-world problems.
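
The loop described in the abstract (evaluate components, eliminate unworthy ones, occasionally eliminate worthy ones, rebuild greedily) can be illustrated with a minimal Python sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the toy objective (preference cost plus a penalty for uncovered demand), the marginal-gain fitness, the random selection threshold, the greedy rebuild rule, and all names such as `PENALTY`, `fitness`, `rebuild` and `search`.

```python
import random

PENALTY = 10  # hypothetical cost per uncovered nurse-day


def shortfall(schedule, demand):
    """Nurse-days of demand left uncovered by a (possibly partial) schedule."""
    cover = [sum(p[d] for p in schedule.values()) for d in range(len(demand))]
    return sum(max(0, need - have) for need, have in zip(demand, cover))


def cost(schedule, pref, demand):
    """Toy objective: preference costs plus a penalty for uncovered demand."""
    prefs = sum(pref[n][p] for n, p in schedule.items())
    return prefs + PENALTY * shortfall(schedule, demand)


def fitness(schedule, pref, demand):
    """Score each component in [0, 1]; higher means a larger contribution."""
    total = cost(schedule, pref, demand)
    gain = {}
    for nurse in schedule:
        rest = {n: p for n, p in schedule.items() if n != nurse}
        # How much worse the schedule becomes if this component is removed.
        gain[nurse] = cost(rest, pref, demand) - total
    lo, hi = min(gain.values()), max(gain.values())
    return {n: 1.0 if hi == lo else (g - lo) / (hi - lo) for n, g in gain.items()}


def rebuild(schedule, nurses, patterns, pref, demand):
    """Greedy constructive step: give each unassigned nurse the locally best pattern."""
    for nurse in nurses:
        if nurse not in schedule:
            schedule[nurse] = min(
                patterns, key=lambda p: cost({**schedule, nurse: p}, pref, demand)
            )
    return schedule


def search(nurses, patterns, pref, demand, iterations=200, p_mutate=0.05, seed=0):
    rng = random.Random(seed)
    current = rebuild({}, nurses, patterns, pref, demand)
    best = dict(current)
    for _ in range(iterations):
        fit = fitness(current, pref, demand)
        # Elimination 1 (selection): drop components that lose to a random threshold.
        kept = {n: p for n, p in current.items() if fit[n] >= rng.random()}
        # Elimination 2 (mutation): drop a few survivors with low probability.
        kept = {n: p for n, p in kept.items() if rng.random() >= p_mutate}
        current = rebuild(kept, nurses, patterns, pref, demand)
        if cost(current, pref, demand) < cost(best, pref, demand):
            best = dict(current)
    return best


# Toy instance: 3 nurses, 3 days, each pattern covers two of the three days.
nurses = ["A", "B", "C"]
patterns = [(1, 1, 0), (0, 1, 1), (1, 0, 1)]
demand = [2, 2, 2]
pref = {n: {p: i for i, p in enumerate(patterns)} for n in nurses}
best = search(nurses, patterns, pref, demand)
```

In this sketch the best schedule is only replaced on strict improvement, so the returned schedule is never worse than the initial greedy construction.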

Jingpeng Li, Uwe Aickelin, Edmund Burke, 2008

Nurse rostering is a complex scheduling problem that affects hospital personnel daily all over the world. This paper presents a new component-based approach with adaptive perturbations for a nurse scheduling problem arising at a major UK hospital. The main idea behind this technique is to decompose a schedule into its components (i.e. the allocated shift pattern of each nurse), and then to mimic a natural evolutionary process on these components to iteratively deliver better schedules. Every component must continuously demonstrate its worthiness in order to remain in the schedule. This demonstration employs a dynamic evaluation function that measures how well each component contributes towards the final objective. Two perturbation steps are then applied: the first removes a number of components deemed not worthy to stay in the current schedule; the second may also throw out, with low probability, some worthy components. The eliminated components are replenished with new ones by a set of constructive heuristics based on local optimality criteria. Computational results on 52 data instances demonstrate the applicability of the proposed approach to real-world problems.

Neural and Evolutionary Computing; Computational Engineering, Finance, and Science

Policy-Guided Heuristic Search with Guarantees

Laurent Orseau, Levi H. S. Lelis, 2021

The use of a policy and a heuristic function for guiding search can be quite effective in adversarial problems, as demonstrated by AlphaGo and its successors, which are based on the PUCT search algorithm. While PUCT can also be used to solve single-agent deterministic problems, it lacks guarantees on its search effort and it can be computationally inefficient in practice. Combining the A* algorithm with a learned heuristic function tends to work better in these domains, but A* and its variants do not use a policy. Moreover, the purpose of using A* is to find solutions of minimum cost, while we seek instead to minimize the search loss (e.g., the number of search steps). LevinTS is guided by a policy and provides guarantees on the number of search steps that relate to the quality of the policy, but it does not make use of a heuristic function. In this work we introduce Policy-guided Heuristic Search (PHS), a novel search algorithm that uses both a heuristic function and a policy and has theoretical guarantees on the search loss that relate to the quality of both the heuristic and the policy. We show empirically on the sliding-tile puzzle, Sokoban, and a puzzle from the commercial game 'The Witness' that PHS enables the rapid learning of both a policy and a heuristic function and compares favorably with A*, Weighted A*, Greedy Best-First Search, LevinTS, and PUCT in terms of number of problems solved and search time in all three domains tested.
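
To make the idea of guiding one search with both a policy and a heuristic concrete, here is a minimal best-first search sketch in Python. The abstract does not state the evaluation function, so the priority used here, (g + h) / pi with pi the product of policy probabilities along the path, is an assumption chosen for illustration; the duplicate check, the function name `policy_heuristic_search` and the toy number-line problem are likewise hypothetical and carry none of the guarantees discussed in the paper.

```python
import heapq
from itertools import count


def policy_heuristic_search(start, is_goal, successors, policy, heuristic,
                            max_expansions=10_000):
    """Best-first search ordered by (g + h) / pi (an illustrative assumption).

    successors(state) -> iterable of (action, next_state, step_cost)
    policy(state)     -> dict mapping action to probability
    heuristic(state)  -> non-negative estimate of the remaining cost
    """
    tie = count()  # tie-breaker so the heap never has to compare states
    frontier = [(heuristic(start), next(tie), start, 0.0, 1.0, [])]
    seen = set()
    expansions = 0
    while frontier and expansions < max_expansions:
        _, _, state, g, pi, path = heapq.heappop(frontier)
        if is_goal(state):
            return path, expansions
        if state in seen:
            continue
        seen.add(state)
        expansions += 1
        probs = policy(state)
        for action, nxt, step in successors(state):
            p = pi * probs.get(action, 0.0)  # probability of the whole path
            if p <= 0.0 or nxt in seen:
                continue
            g2 = g + step
            priority = (g2 + heuristic(nxt)) / p
            heapq.heappush(frontier, (priority, next(tie), nxt, g2, p, path + [action]))
    return None, expansions


# Toy problem: walk from 0 to 5 on the integer line with unit-cost steps.
def successors(s):
    return [("+1", s + 1, 1.0), ("-1", s - 1, 1.0)]


def policy(s):
    return {"+1": 0.7, "-1": 0.3}  # a fixed policy that prefers moving right


path, expansions = policy_heuristic_search(
    0, lambda s: s == 5, successors, policy, lambda s: abs(5 - s))
```

Dividing by the path probability means that unlikely action sequences are postponed, while the heuristic term still favours nodes that appear close to a goal.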

Artificial Intelligence, Machine Learning

A* Search Without Expansions: Learning Heuristic Functions with Deep Q-Networks

Forest Agostinelli, Alexander Shmakov, Stephen McAleer, 2021

A* search is an informed search algorithm that uses a heuristic function to guide the order in which nodes are expanded. Since the computation required to expand a node and compute the heuristic values for all of its generated children grows linearly with the size of the action space, A* search can become impractical for problems with large action spaces. This computational burden becomes even more apparent when heuristic functions are learned by general, but computationally expensive, deep neural networks. To address this problem, we introduce DeepCubeAQ, a deep reinforcement learning and search algorithm that builds on the DeepCubeA algorithm and deep Q-networks. DeepCubeAQ learns a heuristic function that, with a single forward pass through a deep neural network, computes the sum of the transition cost and the heuristic value of all of the children of a node without explicitly generating any of the children, eliminating the need for node expansions. DeepCubeAQ then uses a novel variant of A* search, called AQ* search, that uses the deep Q-network to guide search. We use DeepCubeAQ to solve the Rubik's cube when formulated with a large action space that includes 1872 meta-actions and show that this 157-fold increase in the size of the action space incurs less than a 4-fold increase in computation time when performing AQ* search and that AQ* search is orders of magnitude faster than A* search.
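
The central idea, scoring all children of a node with one call and generating a child only when it is taken off the queue, can be sketched as follows. This is a simplified illustration and not the DeepCubeAQ or AQ* implementation: the plain Python function `q_values` stands in for a single forward pass of a deep Q-network, and the names `q_guided_search` and `apply_action`, the duplicate check and the toy number-line problem are assumptions.

```python
import heapq
from itertools import count


def q_guided_search(start, is_goal, apply_action, q_values, max_pops=10_000):
    """A*-style search whose queue holds (parent, action) pairs, not children.

    q_values(state)        -> dict: action -> estimated step cost plus
                              cost-to-go of the child that action leads to
    apply_action(state, a) -> (next_state, step_cost); only called on pop
    """
    tie = count()  # tie-breaker so the heap never compares states
    # Entry layout: (priority, tie, parent_state, action, g_of_parent, path)
    frontier = [(0.0, next(tie), None, None, 0.0, [])]
    seen = set()
    generated = 0
    pops = 0
    while frontier and pops < max_pops:
        _, _, parent, action, g, path = heapq.heappop(frontier)
        pops += 1
        if action is None:
            state = start
        else:
            # The child is materialised only now, when it is actually needed.
            state, step = apply_action(parent, action)
            generated += 1
            g, path = g + step, path + [action]
        if is_goal(state):
            return path, generated
        if state in seen:
            continue
        seen.add(state)
        # One call scores every action; no child state is built here.
        for a, q in q_values(state).items():
            heapq.heappush(frontier, (g + q, next(tie), state, a, g, path))
    return None, generated


# Toy problem: reach 5 from 0 using steps of -1, +1 or +3, each costing 1.
GOAL = 5
ACTIONS = (-1, 1, 3)


def apply_action(s, a):
    return s + a, 1.0


def q_values(s):
    # Stand-in for a learned network: step cost plus distance left after the move.
    return {a: 1.0 + abs(GOAL - (s + a)) for a in ACTIONS}


path, generated = q_guided_search(0, lambda s: s == GOAL, apply_action, q_values)
```

Because queue entries are (parent, action) pairs, the number of states actually generated equals the number of non-root pops rather than the number of pushes, which is where the saving for large action spaces comes from in this sketch.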

Artificial Intelligence, Machine Learning

Reproducibility in Evolutionary Computation

Manuel López-Ibáñez, Juergen Branke (University of Warwick), 2021

Experimental studies are prevalent in Evolutionary Computation (EC), and concerns about the reproducibility and replicability of such studies have increased in recent times, reflecting similar concerns in other scientific fields. In this article, we discuss, within the context of EC, the different types of reproducibility and suggest a classification that refines the badge system of the Association for Computing Machinery (ACM) adopted by ACM Transactions on Evolutionary Learning and Optimization (https://dlnext.acm.org/journal/telo). We identify cultural and technical obstacles to reproducibility in the EC field. Finally, we provide guidelines and suggest tools that may help to overcome some of these reproducibility obstacles.

Artificial Intelligence, Neural and Evolutionary Computing, Optimization and Control

ES-ENAS: Controller-Based Architecture Search for Evolutionary Reinforcement Learning

Xingyou Song, Krzysztof Choromanski, Jack Parker-Holder, 2021

We introduce ES-ENAS, a simple yet general evolutionary joint optimization procedure that combines continuous optimization via Evolutionary Strategies (ES) and combinatorial optimization via Efficient NAS (ENAS) in a highly scalable and intuitive way. Our main insight is that ES is already a highly distributed algorithm involving hundreds of forward passes, which can be used not only for training neural network weights but also for jointly training a NAS controller, both in a blackbox fashion. By doing so, we also bridge the gap from NAS research in supervised learning settings to the reinforcement learning scenario through this relatively simple marriage between two different yet common lines of research. We demonstrate the utility and effectiveness of our method over a large search space by training highly combinatorial neural network architectures for RL problems in continuous control, via edge pruning and quantization. We also incorporate a wide variety of popular techniques from the modern NAS literature, including multiobjective optimization along with various controller methods, to showcase their promise in the RL field and discuss possible extensions.

Machine Learning, Neural and Evolutionary Computing, Robotics