
Image classification with Deep Convolutional Neural Network Using Tensorflow and Transfer of Learning


Publication date: 2020
Research language: Arabic
Created by Shamra Editor





Deep learning algorithms have recently achieved considerable success, especially in computer vision. This research describes a classification method applied to a dataset containing multiple types of images: Synthetic Aperture Radar (SAR) images and non-SAR images. The classification uses transfer learning followed by fine-tuning, with architectures pre-trained on the well-known ImageNet image database. The VGG16 model was used as a feature extractor, and a new classifier was trained on the extracted features. The dataset consists of five classes: one SAR image class (houses) and four non-SAR image classes (cats, dogs, horses, and humans). A Convolutional Neural Network (CNN) was chosen for training because it produces high accuracy. The final accuracy reached 91.18% across the five classes. The results are discussed in terms of per-class classification accuracy, expressed as a percentage. The cats class reached 99.6% and the houses class reached 100%. The other classes scored 90% or above on average.
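The abstract's pipeline (a frozen VGG16 feature extractor, a newly trained classifier, then fine-tuning) can be sketched roughly as follows in TensorFlow/Keras. This is a minimal illustration, not the authors' code. The 224x224 input size, the 256-unit hidden layer, the learning rates, the choice to unfreeze only the last convolutional block, and the function names `build_transfer_model` and `unfreeze_last_block` are all assumptions; the abstract does not specify them.

```python
import tensorflow as tf


def build_transfer_model(num_classes=5, input_shape=(224, 224, 3), weights="imagenet"):
    """Frozen VGG16 base plus a new classifier head (head sizes are assumptions)."""
    # Convolutional base pre-trained on ImageNet, without its original classifier.
    base = tf.keras.applications.VGG16(
        weights=weights, include_top=False, input_shape=input_shape
    )
    base.trainable = False  # use VGG16 purely as a feature extractor

    # Inputs are assumed to be already preprocessed in the data pipeline with
    # tf.keras.applications.vgg16.preprocess_input.
    inputs = tf.keras.Input(shape=input_shape)
    x = base(inputs, training=False)
    x = tf.keras.layers.GlobalAveragePooling2D()(x)
    x = tf.keras.layers.Dense(256, activation="relu")(x)  # hypothetical head size
    outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(x)

    model = tf.keras.Model(inputs, outputs)
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model, base


def unfreeze_last_block(model, base, learning_rate=1e-5):
    """Fine-tuning step: unfreeze only VGG16's block5 layers (an assumed choice)."""
    base.trainable = True
    for layer in base.layers:
        layer.trainable = layer.name.startswith("block5")
    # Recompile so the changed trainable flags take effect; a small learning
    # rate limits how far the pre-trained weights move.
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=learning_rate),
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model
```

In such a setup one would typically call `model.fit` once with the base frozen, then call `unfreeze_last_block` and train for a few more epochs.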


Artificial intelligence review:
Research summary
This paper describes a method for classifying images using deep convolutional neural networks (CNNs) and transfer learning with the VGG16 model. The method was applied to a dataset containing five types of images: Synthetic Aperture Radar (SAR) images and non-SAR images (cats, dogs, horses, and humans). The researchers used VGG16 as a feature extractor and then trained a new classifier on the extracted features. The results show a classification accuracy of 91.18%, with the cats class reaching 99.6% and the houses class reaching 100%. Performance was evaluated with metrics such as accuracy, F1 score, recall, and precision. The model was implemented with the TensorFlow library and the Python programming language. The researchers concluded that transfer learning combined with CNNs is an effective way to classify images with high accuracy and a short training time.
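The evaluation metrics named in the summary (accuracy, precision, recall, and F1 score) can all be derived from a confusion matrix. The sketch below shows one way to compute them in plain Python. The function name `per_class_metrics` and the toy labels are illustrative assumptions, not data or code from the paper.

```python
def per_class_metrics(y_true, y_pred, labels):
    """Return overall accuracy and per-class precision, recall, and F1."""
    index = {label: i for i, label in enumerate(labels)}
    n = len(labels)

    # confusion[t][p] counts samples of true class t predicted as class p.
    confusion = [[0] * n for _ in range(n)]
    for t, p in zip(y_true, y_pred):
        confusion[index[t]][index[p]] += 1

    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(n))
    accuracy = correct / total if total else 0.0

    report = {}
    for label, i in index.items():
        tp = confusion[i][i]
        fp = sum(confusion[r][i] for r in range(n)) - tp  # predicted i, but wrong
        fn = sum(confusion[i]) - tp                       # true i, but missed
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        report[label] = {"precision": precision, "recall": recall, "f1": f1}
    return accuracy, report


# Toy example with made-up predictions (not results from the paper).
toy_true = ["cat", "cat", "dog", "dog"]
toy_pred = ["cat", "dog", "dog", "dog"]
toy_accuracy, toy_report = per_class_metrics(toy_true, toy_pred, ["cat", "dog"])
```

Reporting per-class values alongside overall accuracy shows whether a single headline figure hides weaker performance on individual classes.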
Critical review
Although the paper presents promising results for image classification with transfer learning and convolutional neural networks, several points could be improved. First, it does not adequately address how class imbalance in the dataset was handled, which can affect the model's accuracy. Second, it does not discuss in detail how dataset size affects model performance, although dataset size can play a large role in raising or lowering accuracy. In addition, a comparison with other models would have shown how the chosen model performs relative to alternatives. Finally, a deeper analysis of the model's errors would have helped improve performance in future work.
Questions related to the research
  1. What are the five classes used in the dataset?

    The five classes are cats, dogs, horses, humans, and houses (Synthetic Aperture Radar, SAR, images).

  2. Which model was used as the feature extractor in this study?

    The VGG16 model was used as the feature extractor.

  3. What final classification accuracy did the study reach?

    The study reached a final classification accuracy of 91.18%.

  4. Which library and programming language were used to implement the model?

    The model was implemented with the TensorFlow library and the Python programming language.



Read More

Machine learning methods for financial document analysis have focused mainly on the textual part. However, the numerical parts of these documents are also rich in information. To analyze financial text further, we should assay the numeric information in depth. In light of this, the purpose of this research is to identify the link between the target cashtag and the target numeral in financial tweets, which is more challenging than analyzing news and official documents. In this research, we developed a multi-model fusion approach that integrates Bidirectional Encoder Representations from Transformers (BERT) and a Convolutional Neural Network (CNN). We also encode the dependency information behind the text into the model to derive semantic latent features. The experimental results show that our model achieves remarkable performance and outperforms the compared systems.
Text classifiers are regularly applied to personal texts, leaving users of these classifiers vulnerable to privacy breaches. We propose a solution for privacy-preserving text classification that is based on Convolutional Neural Networks (CNNs) and Secure Multiparty Computation (MPC). Our method enables the inference of a class label for a personal text in such a way that (1) the owner of the personal text does not have to disclose their text to anyone in an unencrypted manner, and (2) the owner of the text classifier does not have to reveal the trained model parameters to the text owner or to anyone else. To demonstrate the feasibility of our protocol for practical private text classification, we implemented it in the PyTorch-based MPC framework CrypTen, using a well-known additive secret sharing scheme in the honest-but-curious setting. We test the runtime of our privacy-preserving text classifier, which is fast enough to be used in practice.
Building models for realistic natural language tasks requires dealing with long texts and accounting for complicated structural dependencies. Neural-symbolic representations have emerged as a way to combine the reasoning capabilities of symbolic methods with the expressiveness of neural networks. However, most of the existing frameworks for combining neural and symbolic representations have been designed for classic relational learning tasks that work over a universe of symbolic entities and relations. In this paper, we present DRaiL, an open-source declarative framework for specifying deep relational models, designed to support a variety of NLP scenarios. Our framework supports easy integration with expressive language encoders, and provides an interface to study the interactions between representation, inference and learning.
In recent years, driven by the requirements of the industrial sector, deep learning has been increasingly applied to the problem of classifying objects in images. Although many algorithms are used in this field, such as Deep Neural Networks (DNNs) and Convolutional Neural Networks (CNNs), the proposed systems lack a comprehensive solution to the difficulties of long training time, floating memory during the training process, and low classification accuracy. CNNs, the most widely used algorithms for this task, are a mathematical model for analyzing image data. A new deep-traversal network pattern was proposed to solve the above problems. The aim of the research is to demonstrate the performance of a CNN-based recognition system with respect to available memory and training time by adapting appropriate variables for the bypass network. The database used in this research is CIFAR10, which consists of 60,000 color images belonging to ten categories, with 6,000 images per category. There are 50,000 training images and 10,000 test images. When tested on a sample of selected images from the CIFAR10 database, the model achieved a classification accuracy of 98.87%.
We consider the hierarchical representation of documents as graphs and use geometric deep learning to classify them into different categories. While graph neural networks can efficiently handle the variable structure of hierarchical documents using the permutation invariant message passing operations, we show that we can gain extra performance improvements using our proposed selective graph pooling operation that arises from the fact that some parts of the hierarchy are invariable across different documents. We applied our model to classify clinical trial (CT) protocols into completed and terminated categories. We use bag-of-words based, as well as pre-trained transformer-based embeddings to featurize the graph nodes, achieving F1 scores around 0.85 on a publicly available large scale CT registry of around 360K protocols. We further demonstrate how the selective pooling can add insights into the CT termination status prediction. We make the source code and dataset splits accessible.
