Caching popular content closer to the mobile user can significantly improve the overall user experience as well as network efficiency, since it relieves backbone network segments during congestion episodes. To find the optimal caching locations, many conventional approaches solve a complex optimization problem that suffers from the curse of dimensionality and may therefore fail to support online decision making. In this paper, we propose a framework that amalgamates model-based optimization with data-driven techniques by transforming an optimization problem into a grayscale image and training a convolutional neural network (CNN) to predict optimal caching location policies. The rationale for the proposed modelling is the strength of CNNs in capturing features in grayscale images, where they reach human-level performance in image recognition problems. The CNN is trained with optimal solutions, and numerical investigations reveal that performance can increase by more than 400% compared to powerful randomized greedy algorithms. The proposed technique therefore appears to be a promising way forward to the holy grail of resource orchestration: providing high-quality decision making in real time.
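As a rough illustration of the idea of transforming an optimization problem into a grayscale image, the minimal sketch below min-max scales a matrix of problem parameters to 8-bit pixel intensities. The encoding, the function name `to_grayscale`, and the layout of the `demand` matrix (rows as candidate cache nodes, columns as content items) are assumptions made for this example and are not taken from the paper.

```python
from typing import List


def to_grayscale(matrix: List[List[float]]) -> List[List[int]]:
    """Min-max scale a 2-D matrix of problem parameters to 0..255 pixel values.

    Hypothetical encoding, chosen only for illustration.
    """
    flat = [value for row in matrix for value in row]
    lowest, highest = min(flat), max(flat)
    span = highest - lowest
    if span == 0:
        # A constant matrix carries no contrast; map it to an all-black image.
        return [[0 for _ in row] for row in matrix]
    return [[round(255 * (value - lowest) / span) for value in row]
            for row in matrix]


# Hypothetical instance: request rate of each content item at each candidate node.
demand = [
    [0.0, 1.0, 4.0],
    [3.0, 4.0, 0.0],
]
image = to_grayscale(demand)
```

An image of this kind could then serve as the input to a CNN, with the optimal caching decisions obtained from a solver used as training labels.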