بحث متقدم مدعوم من الذكاء الصنعي

مساحة جديدة

اشترك بالحزمة الذهبية واحصل على وصول غير محدود شمرا أكاديميا

تسجيل مستخدم جديد

A Survey of Methods for Low-Power Deep Learning and Computer Vision

175 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Abhinav Goel

تاريخ النشر 2020

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Abhinav Goel - Caleb Tung - Yung-Hsiang Lu

الرؤية الحاسوبية وتمييز الأنماط

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

Deep neural networks (DNNs) are successful in many computer vision tasks. However, the most accurate DNNs require millions of parameters and operations, making them energy, computation and memory intensive. This impedes the deployment of large DNNs in low-power devices with limited compute resources. Recent research improves DNN models by reducing the memory requirement, energy consumption, and number of operations without significantly decreasing the accuracy. This paper surveys the progress of low-power deep learning and computer vision, specifically in regards to inference, and discusses the methods for compacting and accelerating DNN models. The techniques can be divided into four major categories: (1) parameter quantization and pruning, (2) compressed convolutional filters and matrix factorization, (3) network architecture search, and (4) knowledge distillation. We analyze the accuracy, advantages, disadvantages, and potential solutions to the problems with the techniques in each category. We also discuss new evaluation metrics as a guideline for future research.

قيم البحث

109 - Sergei Alyamkin , Matthew Ardi , Alexander C. Berg 2019

Computer vision has achieved impressive progress in recent years. Meanwhile, mobile phones have become the primary computing platforms for millions of people. In addition to mobile phones, many autonomous systems rely on visual data for making decisi ons and some of these systems have limited energy (such as unmanned aerial vehicles also called drones and mobile robots). These systems rely on batteries and energy efficiency is critical. This article serves two main purposes: (1) Examine the state-of-the-art for low-power solutions to detect objects in images. Since 2015, the IEEE Annual International Low-Power Image Recognition Challenge (LPIRC) has been held to identify the most energy-efficient computer vision solutions. This article summarizes 2018 winners solutions. (2) Suggest directions for research as well as opportunities for low-power computer vision.

الرؤية الحاسوبية وتمييز الأنماط الذكاء الاصطناعي الأداء

Deep Learning for Vision-based Prediction: A Survey

422 - Amir Rasouli 2020

Vision-based prediction algorithms have a wide range of applications including autonomous driving, surveillance, human-robot interaction, weather prediction. The objective of this paper is to provide an overview of the field in the past five years wi th a particular focus on deep learning approaches. For this purpose, we categorize these algorithms into video prediction, action prediction, trajectory prediction, body motion prediction, and other prediction applications. For each category, we highlight the common architectures, training methods and types of data used. In addition, we discuss the common evaluation metrics and datasets used for vision-based prediction tasks. A database of all the information presented in this survey including, cross-referenced according to papers, datasets and metrics, can be found online at https://github.com/aras62/vision-based-prediction.

الرؤية الحاسوبية وتمييز الأنماط التعلم الآلي علم الروبوتات

Fashion Meets Computer Vision: A Survey

172 - Wen-Huang Cheng , Sijie Song , Chieh-Yun Chen 2020

Fashion is the way we present ourselves to the world and has become one of the worlds largest industries. Fashion, mainly conveyed by vision, has thus attracted much attention from computer vision researchers in recent years. Given the rapid developm ent, this paper provides a comprehensive survey of more than 200 major fashion-related works covering four main aspects for enabling intelligent fashion: (1) Fashion detection includes landmark detection, fashion parsing, and item retrieval, (2) Fashion analysis contains attribute recognition, style learning, and popularity prediction, (3) Fashion synthesis involves style transfer, pose transformation, and physical simulation, and (4) Fashion recommendation comprises fashion compatibility, outfit matching, and hairstyle suggestion. For each task, the benchmark datasets and the evaluation protocols are summarized. Furthermore, we highlight promising directions for future research.

الرؤية الحاسوبية وتمييز الأنماط

Learning to Resize Images for Computer Vision Tasks

132 - Hossein Talebi , Peyman Milanfar 2021

For all the ways convolutional neural nets have revolutionized computer vision in recent years, one important aspect has received surprisingly little attention: the effect of image size on the accuracy of tasks being trained for. Typically, to be eff icient, the input images are resized to a relatively small spatial resolution (e.g. 224x224), and both training and inference are carried out at this resolution. The actual mechanism for this re-scaling has been an afterthought: Namely, off-the-shelf image resizers such as bilinear and bicubic are commonly used in most machine learning software frameworks. But do these resizers limit the on task performance of the trained networks? The answer is yes. Indeed, we show that the typical linear resizer can be replaced with learned resizers that can substantially improve performance. Importantly, while the classical resizers typically result in better perceptual quality of the downscaled images, our proposed learned resizers do not necessarily give better visual quality, but instead improve task performance. Our learned image resizer is jointly trained with a baseline vision model. This learned CNN-based resizer creates machine friendly visual manipulations that lead to a consistent improvement of the end task metric over the baseline model. Specifically, here we focus on the classification task with the ImageNet dataset, and experiment with four different models to learn resizers adapted to each model. Moreover, we show that the proposed resizer can also be useful for fine-tuning the classification baselines for other vision tasks. To this end, we experiment with three different baselines to develop image quality assessment (IQA) models on the AVA dataset.

الرؤية الحاسوبية وتمييز الأنماط التعلم الآلي

Deep Learning for Embodied Vision Navigation: A Survey

168 - Fengda Zhu , Yi Zhu , Xiaodan Liang 2021

Navigation is one of the fundamental features of a autonomous robot. And the ability of long-term navigation with semantic instruction is a `holy grail` goals of intelligent robots. The development of 3D simulation technology provide a large scale of data to simulate the real-world environment. The deep learning proves its ability to robustly learn various embodied navigation tasks. However, deep learning on embodied navigation is still in its infancy due to the unique challenges faced by the navigation exploration and learning from partial observed visual input. Recently, deep learning in embodied navigation has become even thriving, with numerous methods have been proposed to tackle different challenges in this area. To give a promising direction for future research, in this paper, we present a comprehensive review of embodied navigation tasks and the recent progress in deep learning based methods. It includes two major tasks: target-oriented navigation and the instruction-oriented navigation.

علم الروبوتات الرؤية الحاسوبية وتمييز الأنماط

سجل دخول لتتمكن من نشر تعليقات

التعليقات

جاري جلب التعليقات

سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها

المعهد الوطني الجزائري للبحث الزراعي

تفاصيل إضافية المزيد من الجامعات

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

A Survey of Methods for Low-Power Deep Learning and Computer Vision

اسأل ChatGPT حول البحث

ﻻ يوجد ملخص باللغة العربية

اقرأ أيضاً