ﻻ يوجد ملخص باللغة العربية
In recent years, with the progress of deep learning technologies, crowd counting has been rapidly developed. In this work, we propose a simple yet effective crowd counting framework that is able to achieve the state-of-the-art performance on various crowded scenes. In particular, we first introduce a perspective-aware density map generation method that is able to produce ground-truth density maps from point annotations to train crowd counting model to accomplish superior performance than prior density map generation techniques. Besides, leveraging our density map generation method, we propose an iterative distillation algorithm to progressively enhance our model with identical network structures, without significantly sacrificing the dimension of the output density maps. In experiments, we demonstrate that, with our simple convolutional neural network architecture strengthened by our proposed training algorithm, our model is able to outperform or be comparable with the state-of-the-art methods. Furthermore, we also evaluate our density map generation approach and distillation algorithm in ablation studies.
Crowd counting from unconstrained scene images is a crucial task in many real-world applications like urban surveillance and management, but it is greatly challenged by the cameras perspective that causes huge appearance variations in peoples scales
Crowd estimation is a very challenging problem. The most recent study tries to exploit auditory information to aid the visual models, however, the performance is limited due to the lack of an effective approach for feature extraction and integration.
State-of-the-art methods for counting people in crowded scenes rely on deep networks to estimate crowd density. They typically use the same filters over the whole image or over large image patches. Only then do they estimate local scale to compensate
While the performance of crowd counting via deep learning has been improved dramatically in the recent years, it remains an ingrained problem due to cluttered backgrounds and varying scales of people within an image. In this paper, we propose a Shall
Significant progress on the crowd counting problem has been achieved by integrating larger context into convolutional neural networks (CNNs). This indicates that global scene context is essential, despite the seemingly bottom-up nature of the problem