أوراق بحثية, رسائل ماجستير ودكتوراه منشورة من قبل Zizhen Zhang

MODRL/D-EL: Multiobjective Deep Reinforcement Learning with Evolutionary Learning for Multiobjective Optimization

203 - Yongxin Zhang , Jiahai Wang , Zizhen Zhang 2021

Learning-based heuristics for solving combinatorial optimization problems has recently attracted much academic attention. While most of the existing works only consider the single objective problem with simple constraints, many real-world problems ha ve the multiobjective perspective and contain a rich set of constraints. This paper proposes a multiobjective deep reinforcement learning with evolutionary learning algorithm for a typical complex problem called the multiobjective vehicle routing problem with time windows (MO-VRPTW). In the proposed algorithm, the decomposition strategy is applied to generate subproblems for a set of attention models. The comprehensive context information is introduced to further enhance the attention models. The evolutionary learning is also employed to fine-tune the parameters of the models. The experimental results on MO-VRPTW instances demonstrate the superiority of the proposed algorithm over other learning-based and iterative-based approaches.

الحوسبة العصبية والتطورية الذكاء الاصطناعي

Meta-Learning-based Deep Reinforcement Learning for Multiobjective Optimization Problems

240 - Zizhen Zhang , Zhiyuan Wu , Jiahai Wang 2021

Deep reinforcement learning (DRL) has recently shown its success in tackling complex combinatorial optimization problems. When these problems are extended to multiobjective ones, it becomes difficult for the existing DRL approaches to flexibly and ef ficiently deal with multiple subproblems determined by weight decomposition of objectives. This paper proposes a concise meta-learning-based DRL approach. It first trains a meta-model by meta-learning. The meta-model is fine-tuned with a few update steps to derive submodels for the corresponding subproblems. The Pareto front is built accordingly. The computational experiments on multiobjective traveling salesman problems demonstrate the superiority of our method over most of learning-based and iteration-based approaches.

الذكاء الاصطناعي

MODRL/D-AM: Multiobjective Deep Reinforcement Learning Algorithm Using Decomposition and Attention Model for Multiobjective Optimization

336 - Hong Wu , Jiahai Wang , Zizhen Zhang 2020

Recently, a deep reinforcement learning method is proposed to solve multiobjective optimization problem. In this method, the multiobjective optimization problem is decomposed to a number of single-objective optimization subproblems and all the subpro blems are optimized in a collaborative manner. Each subproblem is modeled with a pointer network and the model is trained with reinforcement learning. However, when pointer network extracts the features of an instance, it ignores the underlying structure information of the input nodes. Thus, this paper proposes a multiobjective deep reinforcement learning method using decomposition and attention model to solve multiobjective optimization problem. In our method, each subproblem is solved by an attention model, which can exploit the structure features as well as node features of input nodes. The experiment results on multiobjective travelling salesman problem show the proposed algorithm achieves better performance compared with the previous method.

الحوسبة العصبية والتطورية التعلم الآلي

A Deep Reinforcement Learning Algorithm Using Dynamic Attention Model for Vehicle Routing Problems

201 - Bo Peng , Jiahai Wang , Zizhen Zhang 2020

Recent researches show that machine learning has the potential to learn better heuristics than the one designed by human for solving combinatorial optimization problems. The deep neural network is used to characterize the input instance for construct ing a feasible solution incrementally. Recently, an attention model is proposed to solve routing problems. In this model, the state of an instance is represented by node features that are fixed over time. However, the fact is, the state of an instance is changed according to the decision that the model made at different construction steps, and the node features should be updated correspondingly. Therefore, this paper presents a dynamic attention model with dynamic encoder-decoder architecture, which enables the model to explore node features dynamically and exploit hidden structure information effectively at different construction steps. This paper focuses on a challenging NP-hard problem, vehicle routing problem. The experiments indicate that our model outperforms the previous methods and also shows a good generalization performance.

التعلم الآلي الذكاء الاصطناعي التعلم الالي

Collective Mobile Sequential Recommendation: A Recommender System for Multiple Taxicabs

141 - Tongwen Wu , Zizhen Zhang , Yanzhi Li 2019

Mobile sequential recommendation was originally designed to find a promising route for a single taxicab. Directly applying it for multiple taxicabs may cause an excessive overlap of recommended routes. The multi-taxicab recommendation problem is chal lenging and has been less studied. In this paper, we first formalize a collective mobile sequential recommendation problem based on a classic mathematical model, which characterizes time-varying influence among competing taxicabs. Next, we propose a new evaluation metric for a collection of taxicab routes aimed to minimize the sum of potential travel time. We then develop an efficient algorithm to calculate the metric and design a greedy recommendation method to approximate the solution. Finally, numerical experiments show the superiority of our methods. In trace-driven simulation, the set of routes recommended by our method significantly outperforms those obtained by conventional methods.

بنى وهياكل البيانات والخوارزميات الذكاء الاصطناعي

A Tabu Search Algorithm for the Multi-period Inspector Scheduling Problem

298 - Hu Qin , Zizhen Zhang , Yubin Xie 2014

This paper introduces a multi-period inspector scheduling problem (MPISP), which is a new variant of the multi-trip vehicle routing problem with time windows (VRPTW). In the MPISP, each inspector is scheduled to perform a route in a given multi-perio d planning horizon. At the end of each period, each inspector is not required to return to the depot but has to stay at one of the vertices for recuperation. If the remaining time of the current period is insufficient for an inspector to travel from his/her current vertex $A$ to a certain vertex B, he/she can choose either waiting at vertex A until the start of the next period or traveling to a vertex C that is closer to vertex B. Therefore, the shortest transit time between any vertex pair is affected by the length of the period and the departure time. We first describe an approach of computing the shortest transit time between any pair of vertices with an arbitrary departure time. To solve the MPISP, we then propose several local search operators adapted from classical operators for the VRPTW and integrate them into a tabu search framework. In addition, we present a constrained knapsack model that is able to produce an upper bound for the problem. Finally, we evaluate the effectiveness of our algorithm with extensive experiments based on a set of test instances. Our computational results indicate that our approach generates high-quality solutions.

الذكاء الاصطناعي بنى وهياكل البيانات والخوارزميات

An Enhanced Branch-and-bound Algorithm for the Talent Scheduling Problem

137 - Zizhen Zhang , Hu Qin , Xiaocong Liang 2014

The talent scheduling problem is a simplified version of the real-world film shooting problem, which aims to determine a shooting sequence so as to minimize the total cost of the actors involved. In this article, we first formulate the problem as an integer linear programming model. Next, we devise a branch-and-bound algorithm to solve the problem. The branch-and-bound algorithm is enhanced by several accelerating techniques, including preprocessing, dominance rules and caching search states. Extensive experiments over two sets of benchmark instances suggest that our algorithm is superior to the current best exact algorithm. Finally, the impacts of different parameter settings are disclosed by some additional experiments.

الذكاء الاصطناعي

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد