Focal Inverse Distance Transform Maps for Crowd Localization and Counting in Dense Crowd


الملخص بالإنكليزية

In this paper, we propose a novel map for dense crowd localization and crowd counting. Most crowd counting methods utilize convolution neural networks (CNN) to regress a density map, achieving significant progress recently. However, these regression-based methods are often unable to provide a precise location for each person, attributed to two crucial reasons: 1) the density map consists of a series of blurry Gaussian blobs, 2) severe overlaps exist in the dense region of the density map. To tackle this issue, we propose a novel Focal Inverse Distance Transform (FIDT) map for crowd localization and counting. Compared with the density maps, the FIDT maps accurately describe the peoples location, without overlap between nearby heads in dense regions. We simultaneously implement crowd localization and counting by regressing the FIDT map. Extensive experiments demonstrate that the proposed method outperforms state-of-the-art localization-based methods in crowd localization tasks, achieving very competitive performance compared with the regression-based methods in counting tasks. In addition, the proposed method presents strong robustness for the negative samples and extremely dense scenes, which further verifies the effectiveness of the FIDT map. The code and models are available at https://github.com/dk-liang/FIDTM.

تحميل البحث