ﻻ يوجد ملخص باللغة العربية
We propose a new method to count objects of specific categories that are significantly smaller than the ground sampling distance of a satellite image. This task is hard due to the cluttered nature of scenes where different object categories occur. Target objects can be partially occluded, vary in appearance within the same class and look alike to different categories. Since traditional object detection is infeasible due to the small size of objects with respect to the pixel size, we cast object counting as a density estimation problem. To distinguish objects of different classes, our approach combines density estimation with semantic segmentation in an end-to-end learnable convolutional neural network (CNN). Experiments show that deep semantic density estimation can robustly count objects of various classes in cluttered scenes. Experiments also suggest that we need specific CNN architectures in remote sensing instead of blindly applying existing ones from computer vision.
It is an innate ability for humans to imagine something only according to their impression, without having to memorize all the details of what they have seen. In this work, we would like to demonstrate that a trained convolutional neural network also
In real-world crowd counting applications, the crowd densities vary greatly in spatial and temporal domains. A detection based counting method will estimate crowds accurately in low density scenes, while its reliability in congested areas is downgrad
In this paper, we propose a novel implicit semantic data augmentation (ISDA) approach to complement traditional augmentation techniques like flipping, translation or rotation. Our work is motivated by the intriguing property that deep networks are su
A crucial capability of real-world intelligent agents is their ability to plan a sequence of actions to achieve their goals in the visual world. In this work, we address the problem of visual semantic planning: the task of predicting a sequence of ac
Data augmentation is widely known as a simple yet surprisingly effective technique for regularizing deep networks. Conventional data augmentation schemes, e.g., flipping, translation or rotation, are low-level, data-independent and class-agnostic ope