ﻻ يوجد ملخص باللغة العربية
Despite the effectiveness of Convolutional Neural Networks (CNNs) for image classification, our understanding of the relationship between shape of convolution kernels and learned representations is limited. In this work, we explore and employ the relationship between shape of kernels which define Receptive Fields (RFs) in CNNs for learning of feature representations and image classification. For this purpose, we first propose a feature visualization method for visualization of pixel-wise classification score maps of learned features. Motivated by our experimental results, and observations reported in the literature for modeling of visual systems, we propose a novel design of shape of kernels for learning of representations in CNNs. In the experimental results, we achieved a state-of-the-art classification performance compared to a base CNN model [28] by reducing the number of parameters and computational time of the model using the ILSVRC-2012 dataset [24]. The proposed models also outperform the state-of-the-art models employed on the CIFAR-10/100 datasets [12] for image classification. Additionally, we analyzed the robustness of the proposed method to occlusion for classification of partially occluded images compared with the state-of-the-art methods. Our results indicate the effectiveness of the proposed approach. The code is available in github.com/minogame/caffe-qhconv.
Deep convolutional neural networks have achieved remarkable success in computer vision. However, deep neural networks require large computing resources to achieve high performance. Although depthwise separable convolution can be an efficient module t
Convolutional neural networks (CNNs) have achieved state-of-the-art results on many visual recognition tasks. However, current CNN models still exhibit a poor ability to be invariant to spatial transformations of images. Intuitively, with sufficient
The square kernel is a standard unit for contemporary Convolutional Neural Networks (CNNs), as it fits well on the tensor computation for the convolution operation. However, the receptive field in the human visual system is actually isotropic like a
A technique named Feature Learning from Image Markers (FLIM) was recently proposed to estimate convolutional filters, with no backpropagation, from strokes drawn by a user on very few images (e.g., 1-3) per class, and demonstrated for coconut-tree im
Convolutional neural networks (CNN) have recently achieved state-of-the-art results in various applications. In the case of image recognition, an ideal model has to learn independently of the training data, both local dependencies between the three c