ROME: A Multi-Resource Job Scheduling Framework for Exascale HPC Systems


Abstract in English

High-performance computing (HPC) is undergoing significant changes. Next generation HPC systems are equipped with diverse global and local resources, such as I/O burst buffer resources, memory resources (e.g., on-chip and off-chip RAM, external RAM/NVRA), network resources, and possibly other resources. Job schedulers play a crucial role in efficient use of resources. However, traditional job schedulers are single-objective and fail to efficient use of other resources. In this paper, we propose ROME, a novel multi-dimensional job scheduling framework to explore potential tradeoffs among multiple resources and provides balanced scheduling decision. Our design leverages genetic algorithm as the multi-dimensional optimization engine to generate fast scheduling decision and to support effective resource utilization.

Download