Large-scale ride-sharing systems combine real-time dispatching and routing optimization over a rolling time horizon with a model predictive control (MPC) component that relocates idle vehicles in anticipation of demand. The MPC optimization operates over a longer time horizon to compensate for the inherently myopic nature of real-time dispatching. Longer horizons improve the quality of relocation decisions but increase computational complexity. Consequently, ride-sharing operators are often forced to use a relatively short time horizon. To address this computational challenge, this paper proposes a hybrid approach that combines machine learning and optimization. The machine-learning component learns the optimal solution to the MPC at an aggregated level to overcome the sparsity and high dimensionality of the solution. The optimization component transforms the machine-learning prediction back to the original granularity through a tractable transportation model. As a consequence, the original NP-hard MPC problem is reduced to a polynomial-time prediction and optimization, which allows ride-sharing operators to consider a longer time horizon. Experimental results show that the hybrid approach achieves significantly better service quality than the MPC optimization in terms of average rider waiting time, due to its ability to model a longer horizon.