ﻻ يوجد ملخص باللغة العربية
Learning-based heuristics for solving combinatorial optimization problems has recently attracted much academic attention. While most of the existing works only consider the single objective problem with simple constraints, many real-world problems have the multiobjective perspective and contain a rich set of constraints. This paper proposes a multiobjective deep reinforcement learning with evolutionary learning algorithm for a typical complex problem called the multiobjective vehicle routing problem with time windows (MO-VRPTW). In the proposed algorithm, the decomposition strategy is applied to generate subproblems for a set of attention models. The comprehensive context information is introduced to further enhance the attention models. The evolutionary learning is also employed to fine-tune the parameters of the models. The experimental results on MO-VRPTW instances demonstrate the superiority of the proposed algorithm over other learning-based and iterative-based approaches.
Recently, a deep reinforcement learning method is proposed to solve multiobjective optimization problem. In this method, the multiobjective optimization problem is decomposed to a number of single-objective optimization subproblems and all the subpro
Deep reinforcement learning (DRL) has recently shown its success in tackling complex combinatorial optimization problems. When these problems are extended to multiobjective ones, it becomes difficult for the existing DRL approaches to flexibly and ef
Subset selection is an important component in evolutionary multiobjective optimization (EMO) algorithms. Clustering, as a classic method to group similar data points together, has been used for subset selection in some fields. However, clustering-bas
Recently, increasing works have proposed to drive evolutionary algorithms using machine learning models. Usually, the performance of such model based evolutionary algorithms is highly dependent on the training qualities of the adopted models. Since i
We study the problem of learning the objective functions or constraints of a multiobjective decision making model, based on a set of sequentially arrived decisions. In particular, these decisions might not be exact and possibly carry measurement nois