Bilevel Learning Model Towards Industrial Scheduling


Abstract in English

Automatic industrial scheduling, aiming at optimizing the sequence of jobs over limited resources, is widely needed in manufacturing industries. However, existing scheduling systems heavily rely on heuristic algorithms, which either generate ineffective solutions or compute inefficiently when job scale increases. Thus, it is of great importance to develop new large-scale algorithms that are not only efficient and effective, but also capable of satisfying complex constraints in practice. In this paper, we propose a Bilevel Deep reinforcement learning Scheduler, textit{BDS}, in which the higher level is responsible for exploring an initial global sequence, whereas the lower level is aiming at exploitation for partial sequence refinements, and the two levels are connected by a sliding-window sampling mechanism. In the implementation, a Double Deep Q Network (DDQN) is used in the upper level and Graph Pointer Network (GPN) lies within the lower level. After the theoretical guarantee for the convergence of BDS, we evaluate it in an industrial automatic warehouse scenario, with job number up to $5000$ in each production line. It is shown that our proposed BDS significantly outperforms two most used heuristics, three strong deep networks, and another bilevel baseline approach. In particular, compared with the most used greedy-based heuristic algorithm in real world which takes nearly an hour, our BDS can decrease the makespan by 27.5%, 28.6% and 22.1% for 3 largest datasets respectively, with computational time less than 200 seconds.

Download