In an extra-large scale MIMO (XL-MIMO) system, the antenna arrays have a physical size that goes well beyond the dimensions of traditional MIMO systems. Because of this large dimensionality, optimizing an XL-MIMO system with conventional optimization tools leads to solutions of prohibitive complexity. In this paper, we propose a machine-learning-based design for the downlink of a multi-user setting with linear pre-processing, where the goal is to select a limited mapping area per user, i.e., a small portion of the array that contains the beamforming energy directed to that user. We refer to this selection as spatial user mapping (SUM). Our solution relies on deep convolutional neural networks arranged in a distributed architecture built to manage the large system dimension. The architecture contains one network per user; all networks operate in parallel and exploit specific non-stationary properties of the channels along the array. Our results show that, once trained, the parallel networks provide the optimal SUM solution in more than $80\%$ of the instances, resulting in a negligible sum-rate loss compared to a system using the optimal SUM solution. The approach also offers an insightful way to rethink problems of this kind, which have no closed-form solution.