
Recognition of convolutional neural network based on CUDA Technology

Posted by Yibin Huang
Publication date: 2015
Research field: Informatics Engineering
Language: English





To address the question of whether the Graphics Processing Unit (GPU), a stream processor with high floating-point computing performance, is applicable to neural networks, this paper proposes a parallel recognition algorithm for Convolutional Neural Networks (CNNs). It adopts Compute Unified Device Architecture (CUDA) technology, defines the parallel data structures, and describes the mapping mechanism for computing tasks on CUDA. The parallel recognition algorithm, implemented on a GPU with the GTX200 hardware architecture, is compared against the serial algorithm on a CPU and runs nearly 60 times faster. The results show that GPUs based on the stream processor architecture are more applicable than CPUs to a range of neural network applications.
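As an illustration of the kind of task mapping the abstract describes, here is a minimal sketch in which each GPU thread computes one pixel of a convolutional feature map. It is written in Python with Numba's CUDA support rather than the paper's original CUDA C, and the array sizes and block shape are illustrative assumptions, not values from the paper.

```python
import numpy as np
from numba import cuda

@cuda.jit
def conv_forward(inp, kern, out):
    # Map each CUDA thread to one (row, col) of the output feature map.
    r, c = cuda.grid(2)
    if r < out.shape[0] and c < out.shape[1]:
        acc = 0.0
        for i in range(kern.shape[0]):
            for j in range(kern.shape[1]):
                acc += inp[r + i, c + j] * kern[i, j]
        out[r, c] = acc

# Illustrative sizes: a 32x32 input map and a 5x5 kernel give a 28x28 output.
inp = np.random.rand(32, 32).astype(np.float32)
kern = np.random.rand(5, 5).astype(np.float32)
d_out = cuda.to_device(np.zeros((28, 28), dtype=np.float32))

threads = (16, 16)
blocks = ((28 + threads[0] - 1) // threads[0],
          (28 + threads[1] - 1) // threads[1])
conv_forward[blocks, threads](cuda.to_device(inp), cuda.to_device(kern), d_out)
out = d_out.copy_to_host()
```

The one-thread-per-output-element mapping is the standard way such a convolution is parallelized on a GPU; the paper's actual data structures and kernel organization may differ.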




Read also

Mengyu Chen, 2021
CNN models are a popular method for image analysis, so they can be used to recognize handwritten digits in the MNIST dataset. To study recognition accuracy, CNN models with different fully connected layer sizes were trained to figure out the relationship between fully connected layer size and recognition accuracy. Inspired by previous pruning work, we applied distinctiveness-based pruning to the CNN models and compared the pruning performance with that of NN models. To improve pruning performance on CNNs, the effect of the angle threshold on pruning was also explored. The evaluation results show that: for the fully connected layer size there is a threshold, such that as the layer size increases, recognition accuracy grows while the size is below the threshold and falls once it exceeds it; pruning performs worse on CNNs than on NNs; and as the pruning angle threshold increases, both the fully connected layer size and the recognition accuracy decrease. The paper also shows that CNN models trained on MNIST are capable of handwritten digit recognition and achieve their highest accuracy with a fully connected layer size of 400. In addition, on the same MNIST dataset, the CNN models outperform the big, deep, simple NN models of a published paper.
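The distinctiveness criterion referred to above can be sketched in a few lines of numpy: two hidden units whose activation vectors over the whole dataset subtend a small enough angle are near-duplicates, so one of them becomes a pruning candidate. The 0.5 recentring (for sigmoid outputs) and the 15-degree default are common choices for this technique, not values taken from the paper.

```python
import numpy as np

def prunable_pairs(acts, angle_threshold_deg=15.0):
    """acts: (n_patterns, n_units) sigmoid activations of one hidden layer."""
    centred = acts - 0.5                      # recentre sigmoid outputs at zero
    norms = np.linalg.norm(centred, axis=0) + 1e-12
    cos = (centred.T @ centred) / np.outer(norms, norms)
    angles = np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))
    i, j = np.triu_indices(angles.shape[0], k=1)
    # Pairs below the angle threshold behave almost identically over the
    # whole dataset, so one unit of each pair is a candidate for removal.
    return [(a, b) for a, b in zip(i, j) if angles[a, b] < angle_threshold_deg]
```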
Jing Pan, Wendao Liu, Jing Zhou, 2020
The freedom to iterate quickly on distributed deep learning tasks is crucial for smaller companies seeking to gain competitive advantages and market share from the big tech giants. HorovodRunner brings this process to relatively accessible Spark clusters. There have been, however, no benchmark tests on HorovodRunner per se, nor specifically on graph convolutional networks (GCN, hereafter), and only very limited scalability benchmark tests on Horovod, its predecessor, which requires custom-built GPU clusters. For the first time, we show that Databricks HorovodRunner achieves a significant lift in scaling efficiency for convolutional neural network (CNN, hereafter) based tasks on both GPU and CPU clusters, but not for the original GCN task. We also implemented the Rectified Adam optimizer for the first time in HorovodRunner.
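For readers unfamiliar with the tool, this is roughly what launching a job through HorovodRunner on Databricks looks like. The `HorovodRunner` entry point shown here is the documented Databricks API; the `train` function is a hypothetical placeholder for a single-node Horovod training script, not code from the paper.

```python
from sparkdl import HorovodRunner  # available on Databricks ML runtimes

def train():
    # Hypothetical single-node Horovod training function; Horovod takes
    # care of averaging gradients across the np worker processes.
    import horovod.tensorflow.keras as hvd
    hvd.init()
    # ... build the CNN/GCN model, wrap its optimizer with
    # hvd.DistributedOptimizer(...), and fit on this worker's data shard ...

hr = HorovodRunner(np=2)  # np = number of parallel worker processes
hr.run(train)
```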
We measured the impact of long-range exponentially decaying intra-areal lateral connectivity on the scaling and memory occupation of a distributed spiking neural network simulator, compared to that of short-range Gaussian decays. While previous studies adopted short-range connectivity, recent experimental neuroscience studies are pointing out the role of longer-range intra-areal connectivity, with implications for neural simulation platforms. Two-dimensional grids of cortical columns composed of up to 11 million point-like spiking neurons with spike-frequency adaptation were connected by up to 30 billion synapses using short- and long-range connectivity models. The MPI processes composing the distributed simulator were run on up to 1024 hardware cores, hosted on a 64-node server platform. The hardware platform was a cluster of IBM NX360 M5 16-core compute nodes, each containing two Intel Xeon Haswell 8-core E5-2630 v3 processors clocked at 2.40 GHz, interconnected through an InfiniBand network equipped with 4x QDR switches.
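A toy numpy comparison of the two connectivity kernels contrasted above; the length constants are illustrative assumptions, chosen only to show why the exponential tail produces far more long-distance synapses than the Gaussian.

```python
import numpy as np

d = np.linspace(0.0, 5.0, 201)             # distance between cortical columns
gaussian = np.exp(-d**2 / (2 * 0.5**2))    # short-range Gaussian decay
exponential = np.exp(-d / 0.5)             # long-range exponential decay
# At d = 3 the Gaussian probability is ~1.5e-8 while the exponential is
# still ~2.5e-3: that heavy tail is what creates the extra long-distance
# synapses, memory occupation and inter-process traffic measured above.
```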
Xiangang Li, Xihong Wu, 2016
Long short-term memory (LSTM) recurrent neural networks (RNNs) have been shown to give state-of-the-art performance on many speech recognition tasks, as they are able to provide a learned, dynamically changing contextual window over the whole sequence history. On the other hand, convolutional neural networks (CNNs) have brought significant improvements to deep feed-forward neural networks (FFNNs), as they are able to better reduce spectral variation in the input signal. In this paper, a network architecture called the convolutional recurrent neural network (CRNN) is proposed by combining the CNN and the LSTM RNN. In the proposed CRNNs, each speech frame, without adjacent context frames, is organized as a number of local feature patches along the frequency axis, and an LSTM network is then applied to each feature patch along the time axis. We train and compare FFNNs, LSTM RNNs and the proposed LSTM CRNNs in various configurations. Experimental results show that the LSTM CRNNs can exceed state-of-the-art speech recognition performance.
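A minimal Keras sketch of the CRNN idea described here, with made-up dimensions: a per-frame convolution extracts local patches along the frequency axis, and an LSTM then models the frame sequence in time. It illustrates the CNN-plus-LSTM combination in general, not the authors' exact topology.

```python
import tensorflow as tf
from tensorflow.keras import layers

n_frames, n_freq, n_classes = 100, 40, 10   # illustrative shapes only

inputs = tf.keras.Input(shape=(n_frames, n_freq, 1))
# Per-frame convolution: the kernel slides along the frequency axis only,
# extracting local spectral patches from each frame independently.
x = layers.TimeDistributed(layers.Conv1D(32, 8, activation="relu"))(inputs)
x = layers.TimeDistributed(layers.Flatten())(x)
# The LSTM then models the sequence of per-frame patch features in time.
x = layers.LSTM(128)(x)
outputs = layers.Dense(n_classes, activation="softmax")(x)
model = tf.keras.Model(inputs, outputs)
```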
This paper describes the proposed methodology, the data used, and the results of our participation in Challenge Track 2 (Expr Challenge Track) of the Affective Behavior Analysis in-the-wild (ABAW) Competition 2020. In this competition, we used a proposed deep convolutional neural network (CNN) model to perform automatic facial expression recognition (AFER) on the given dataset. Our proposed model achieved an accuracy of 50.77% and an F1 score of 29.16% on the validation set.