ﻻ يوجد ملخص باللغة العربية
Deep Convolutional Neural Networks (DCNNs) are capable of obtaining powerful image representations, which have attracted great attentions in image recognition. However, they are limited in modeling orientation transformation by the internal mechanism. In this paper, we develop Orientation Convolution Networks (OCNs) for image recognition based on the proposed Landmark Gabor Filters (LGFs) that the robustness of the learned representation against changed of orientation can be enhanced. By modulating the convolutional filter with LGFs, OCNs can be compatible with any existing deep learning networks. LGFs act as a Gabor filter bank achieved by selecting $ p $ $ left( ll nright) $ representative Gabor filters as andmarks and express the original Gabor filters as sparse linear combinations of these landmarks. Specifically, based on a matrix factorization framework, a flexible integration for the local and the global structure of original Gabor filters by sparsity and low-rank constraints is utilized. With the propogation of the low-rank structure, the corresponding sparsity for representation of original Gabor filter bank can be significantly promoted. Experimental results over several benchmarks demonstrate that our method is less sensitive to the orientation and produce higher performance both in accuracy and cost, compared with the existing state-of-art methods. Besides, our OCNs have few parameters to learn and can significantly reduce the complexity of training network.
Handwritten text recognition is challenging because of the virtually infinite ways a human can write the same message. Our fully convolutional handwriting model takes in a handwriting sample of unknown length and outputs an arbitrary stream of symbol
Object recognition from live video streams comes with numerous challenges such as the variation in illumination conditions and poses. Convolutional neural networks (CNNs) have been widely used to perform intelligent visual object recognition. Yet, CN
Convolutional Architecture for Fast Feature Encoding (CAFFE) [11] is a software package for the training, classifying, and feature extraction of images. The UCF Sports Action dataset is a widely used machine learning dataset that has 200 videos taken
Many problems in science and engineering can be formulated in terms of geometric patterns in high-dimensional spaces. We present high-dimensional convolutional networks (ConvNets) for pattern recognition problems that arise in the context of geometri
Analyzing videos of human actions involves understanding the temporal relationships among video frames. State-of-the-art action recognition approaches rely on traditional optical flow estimation methods to pre-compute motion information for CNNs. Suc