Moving loads such as cars and trains are useful sources of seismic waves, which can be analyzed with the techniques of ambient noise seismology to retrieve the seismic velocity of subsurface materials. This information is valuable for a variety of applications, such as geotechnical characterization of the near-surface, seismic hazard evaluation, and groundwater monitoring. However, for such analyses to converge quickly, data segments with appropriate noise energy must be selected. Distributed Acoustic Sensing (DAS) is a novel sensing technique that enables acquisition of these data at very high spatial and temporal resolution over tens of kilometers. One major challenge in using DAS technology is the large volume of data it produces, which makes locating the segments that contain useful energy a significant Big Data problem. In this work, we present a highly scalable and efficient approach to processing real, complex DAS data that combines physics knowledge acquired during a data exploration phase with deep supervised learning. The approach identifies useful coherent surface waves generated by anthropogenic activity, a class of seismic waves that is abundant in these recordings and is useful for geophysical imaging. Data exploration and training were done on 130~gigabytes (GB) of DAS measurements. Using parallel computing, we were able to run inference on an additional 170~GB of data (the equivalent of 10 days' worth of recordings) in less than 30 minutes. Our method provides interpretable patterns describing the interaction of ground-based human activities with the buried sensors.