ﻻ يوجد ملخص باللغة العربية
In this paper, we propose a novel map for dense crowd localization and crowd counting. Most crowd counting methods utilize convolution neural networks (CNN) to regress a density map, achieving significant progress recently. However, these regression-based methods are often unable to provide a precise location for each person, attributed to two crucial reasons: 1) the density map consists of a series of blurry Gaussian blobs, 2) severe overlaps exist in the dense region of the density map. To tackle this issue, we propose a novel Focal Inverse Distance Transform (FIDT) map for crowd localization and counting. Compared with the density maps, the FIDT maps accurately describe the peoples location, without overlap between nearby heads in dense regions. We simultaneously implement crowd localization and counting by regressing the FIDT map. Extensive experiments demonstrate that the proposed method outperforms state-of-the-art localization-based methods in crowd localization tasks, achieving very competitive performance compared with the regression-based methods in counting tasks. In addition, the proposed method presents strong robustness for the negative samples and extremely dense scenes, which further verifies the effectiveness of the FIDT map. The code and models are available at https://github.com/dk-liang/FIDTM.
Recent works on crowd counting mainly leverage Convolutional Neural Networks (CNNs) to count by regressing density maps, and have achieved great progress. In the density map, each person is represented by a Gaussian blob, and the final count is obtai
In this paper, we address the challenging problem of crowd counting in congested scenes. Specifically, we present Inverse Attention Guided Deep Crowd Counting Network (IA-DCCN) that efficiently infuses segmentation information through an inverse atte
While the performance of crowd counting via deep learning has been improved dramatically in the recent years, it remains an ingrained problem due to cluttered backgrounds and varying scales of people within an image. In this paper, we propose a Shall
In crowd counting, each training image contains multiple people, where each person is annotated by a dot. Existing crowd counting methods need to use a Gaussian to smooth each annotated dot or to estimate the likelihood of every pixel given the annot
State-of-the-art methods for counting people in crowded scenes rely on deep networks to estimate crowd density. They typically use the same filters over the whole image or over large image patches. Only then do they estimate local scale to compensate