Inspired by the finding that humans, when identifying objects in the real world, engage the visual cortex regions corresponding to an object's real size to analyze its features, this paper presents SizeNet, a framework that uses both the real sizes and the features of objects to solve object recognition problems. SizeNet was evaluated in object recognition experiments on the self-built Rsize dataset and compared with the state-of-the-art methods AlexNet, VGG-16, Inception V3, ResNet-18, and DenseNet-121. The results showed that SizeNet achieves much higher object recognition accuracy than the other algorithms. SizeNet addresses two problems: correctly recognizing objects that have highly similar features but clearly different real sizes, and correctly distinguishing a target object from interference objects whose real sizes clearly differ from that of the target. It can do so because it recognizes objects based not only on their features but also on their real size. The real size of an object helps exclude interference categories whose real-size ranges do not match it, which greatly reduces the number of categories in the label set used for the downstream feature-based recognition. SizeNet is of great significance for the study of interpretable computer vision. Our code and dataset will be made public.
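The size-based exclusion step described above can be illustrated with a minimal sketch. This is not the authors' implementation: the category names, the size ranges in `SIZE_RANGES`, the feature scores, and the function names `filter_by_size` and `recognize` are all hypothetical, chosen only to show how a real-size estimate might shrink the label set before a feature-based decision is made.

```python
# Hypothetical real-size ranges in metres per category.
# These values are illustrative assumptions, not taken from the paper.
SIZE_RANGES = {
    "toy car": (0.05, 0.30),
    "car": (3.0, 6.0),
    "basketball": (0.20, 0.30),
    "orange": (0.05, 0.12),
}


def filter_by_size(real_size, size_ranges):
    """Return the categories whose real-size range contains real_size."""
    return [
        label
        for label, (low, high) in size_ranges.items()
        if low <= real_size <= high
    ]


def recognize(feature_scores, real_size, size_ranges=SIZE_RANGES):
    """Pick the best-scoring category among those consistent with the real size.

    feature_scores stands in for the output of any feature-based classifier
    (one score per category). Returns None if no category matches the size.
    """
    candidates = filter_by_size(real_size, size_ranges)
    if not candidates:
        return None
    return max(candidates, key=lambda label: feature_scores.get(label, float("-inf")))


# A toy car and a real car look alike, so their feature scores are nearly tied.
scores = {"toy car": 0.48, "car": 0.47, "basketball": 0.03, "orange": 0.02}

print(recognize(scores, real_size=4.5))   # only "car" fits a 4.5 m object
print(recognize(scores, real_size=0.15))  # only "toy car" fits a 0.15 m object
```

In this toy setup the feature scores alone would always favor "toy car", while the size filter lets the same scores resolve to "car" when the object is several metres long, which mirrors the first of the two problems the abstract describes.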
Convolutional neural networks (CNNs) have demonstrated successful applications in computer vision, speech recognition, and natural language processing. For object recognition, CNNs might be limited by their strict label requirement and an implic
Advancements in convolutional neural networks (CNNs) have made significant strides toward achieving high performance levels on multiple object recognition tasks. While some approaches utilize information from the entire scene to propose regions of in
Object recognition from live video streams comes with numerous challenges such as the variation in illumination conditions and poses. Convolutional neural networks (CNNs) have been widely used to perform intelligent visual object recognition. Yet, CN
Existing region-based object detectors are limited to regions with fixed box geometry to represent objects, even if those are highly non-rectangular. In this paper we introduce DP-FCN, a deep model for object detection which explicitly adapts to shap
This paper revisits human-object interaction (HOI) recognition at the image level without using supervision of object location and human pose. We name it detection-free HOI recognition, in contrast to the existing detection-supervised approaches which r