بحث متقدم مدعوم من الذكاء الصنعي

مساحة جديدة

اشترك بالحزمة الذهبية واحصل على وصول غير محدود شمرا أكاديميا

تسجيل مستخدم جديد

Orientation Convolutional Networks for Image Recognition

115 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Guorui Feng

تاريخ النشر 2021

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Yalan Qin - Guorui Feng - Hanzhou Wu

الرؤية الحاسوبية وتمييز الأنماط

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

Deep Convolutional Neural Networks (DCNNs) are capable of obtaining powerful image representations, which have attracted great attentions in image recognition. However, they are limited in modeling orientation transformation by the internal mechanism. In this paper, we develop Orientation Convolution Networks (OCNs) for image recognition based on the proposed Landmark Gabor Filters (LGFs) that the robustness of the learned representation against changed of orientation can be enhanced. By modulating the convolutional filter with LGFs, OCNs can be compatible with any existing deep learning networks. LGFs act as a Gabor filter bank achieved by selecting $ p $ $ left( ll nright) $ representative Gabor filters as andmarks and express the original Gabor filters as sparse linear combinations of these landmarks. Specifically, based on a matrix factorization framework, a flexible integration for the local and the global structure of original Gabor filters by sparsity and low-rank constraints is utilized. With the propogation of the low-rank structure, the corresponding sparsity for representation of original Gabor filter bank can be significantly promoted. Experimental results over several benchmarks demonstrate that our method is less sensitive to the orientation and produce higher performance both in accuracy and cost, compared with the existing state-of-art methods. Besides, our OCNs have few parameters to learn and can significantly reduce the complexity of training network.

قيم البحث

103 - Felipe Petroski Such , Dheeraj Peri , Frank Brockler 2019

Handwritten text recognition is challenging because of the virtually infinite ways a human can write the same message. Our fully convolutional handwriting model takes in a handwriting sample of unknown length and outputs an arbitrary stream of symbol s. Our dual stream architecture uses both local and global context and mitigates the need for heavy preprocessing steps such as symbol alignment correction as well as complex post processing steps such as connectionist temporal classification, dictionary matching or language models. Using over 100 unique symbols, our model is agnostic to Latin-based languages, and is shown to be quite competitive with state of the art dictionary based methods on the popular IAM and RIMES datasets. When a dictionary is known, we further allow a probabilistic character error rate to correct errant word blocks. Finally, we introduce an attention based mechanism which can automatically target variants of handwriting, such as slant, stroke width, or noise.

الرؤية الحاسوبية وتمييز الأنماط

Cloud based Scalable Object Recognition from Video Streams using Orientation Fusion and Convolutional Neural Networks

75 - Muhammad Usman Yaseen , Ashiq Anjum , Giancarlo Fortino 2021

Object recognition from live video streams comes with numerous challenges such as the variation in illumination conditions and poses. Convolutional neural networks (CNNs) have been widely used to perform intelligent visual object recognition. Yet, CN Ns still suffer from severe accuracy degradation, particularly on illumination-variant datasets. To address this problem, we propose a new CNN method based on orientation fusion for visual object recognition. The proposed cloud-based video analytics system pioneers the use of bi-dimensional empirical mode decomposition to split a video frame into intrinsic mode functions (IMFs). We further propose these IMFs to endure Reisz transform to produce monogenic object components, which are in turn used for the training of CNNs. Past works have demonstrated how the object orientation component may be used to pursue accuracy levels as high as 93%. Herein we demonstrate how a feature-fusion strategy of the orientation components leads to further improving visual recognition accuracy to 97%. We also assess the scalability of our method, looking at both the number and the size of the video streams under scrutiny. We carry out extensive experimentation on the publicly available Yale dataset, including also a self generated video datasets, finding significant improvements (both in accuracy and scale), in comparison to AlexNet, LeNet and SE-ResNeXt, which are the three most commonly used deep learning models for visual object recognition and classification.

الرؤية الحاسوبية وتمييز الأنماط

Convolutional Architecture Exploration for Action Recognition and Image Classification

61 - J.T. Turner , David Aha , Leslie Smith 2015

Convolutional Architecture for Fast Feature Encoding (CAFFE) [11] is a software package for the training, classifying, and feature extraction of images. The UCF Sports Action dataset is a widely used machine learning dataset that has 200 videos taken in 720x480 resolution of 9 different sporting activities: diving, golf, swinging, kicking, lifting, horseback riding, running, skateboarding, swinging (various gymnastics), and walking. In this report we report on a caffe feature extraction pipeline of images taken from the videos of the UCF Sports Action dataset. A similar test was performed on overfeat, and results were inferior to caffe. This study is intended to explore the architecture and hyper parameters needed for effective static analysis of action in videos and classification over a variety of image datasets.

الرؤية الحاسوبية وتمييز الأنماط

High-dimensional Convolutional Networks for Geometric Pattern Recognition

215 - Christopher Choy , Junha Lee , Rene Ranftl 2020

Many problems in science and engineering can be formulated in terms of geometric patterns in high-dimensional spaces. We present high-dimensional convolutional networks (ConvNets) for pattern recognition problems that arise in the context of geometri c registration. We first study the effectiveness of convolutional networks in detecting linear subspaces in high-dimensional spaces with up to 32 dimensions: much higher dimensionality than prior applications of ConvNets. We then apply high-dimensional ConvNets to 3D registration under rigid motions and image correspondence estimation. Experiments indicate that our high-dimensional ConvNets outperform prior approaches that relied on deep networks based on global pooling operators.

الرؤية الحاسوبية وتمييز الأنماط التعلم الآلي التعلم الالي

Hidden Two-Stream Convolutional Networks for Action Recognition

159 - Yi Zhu , Zhenzhong Lan , Shawn Newsam 2017

Analyzing videos of human actions involves understanding the temporal relationships among video frames. State-of-the-art action recognition approaches rely on traditional optical flow estimation methods to pre-compute motion information for CNNs. Suc h a two-stage approach is computationally expensive, storage demanding, and not end-to-end trainable. In this paper, we present a novel CNN architecture that implicitly captures motion information between adjacent frames. We name our approach hidden two-stream CNNs because it only takes raw video frames as input and directly predicts action classes without explicitly computing optical flow. Our end-to-end approach is 10x faster than its two-stage baseline. Experimental results on four challenging action recognition datasets: UCF101, HMDB51, THUMOS14 and ActivityNet v1.2 show that our approach significantly outperforms the previous best real-time approaches.

الرؤية الحاسوبية وتمييز الأنماط التعلم الآلي الوسائط المتعددة

سجل دخول لتتمكن من نشر تعليقات

التعليقات

جاري جلب التعليقات

سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها

جامعة الشام الخاصة

تفاصيل إضافية المزيد من الجامعات

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

Orientation Convolutional Networks for Image Recognition

اسأل ChatGPT حول البحث

ﻻ يوجد ملخص باللغة العربية

اقرأ أيضاً