بحث متقدم مدعوم من الذكاء الصنعي

مساحة جديدة

اشترك بالحزمة الذهبية واحصل على وصول غير محدود شمرا أكاديميا

تسجيل مستخدم جديد

Probabilistic Global Scale Estimation for MonoSLAM Based on Generic Object Detection

58 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Edgar Sucar

تاريخ النشر 2017

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Edgar Sucar - Jean-Bernard Hayet

الرؤية الحاسوبية وتمييز الأنماط

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

This paper proposes a novel method to estimate the global scale of a 3D reconstructed model within a Kalman filtering-based monocular SLAM algorithm. Our Bayesian framework integrates height priors over the detected objects belonging to a set of broad predefined classes, based on recent advances in fast generic object detection. Each observation is produced on single frames, so that we do not need a data association process along video frames. This is because we associate the height priors with the image region sizes at image places where map features projections fall within the object detection regions. We present very promising results of this approach obtained on several experiments with different object classes.

قيم البحث

64 - Edgar Sucar , Jean-Bernard Hayet 2017

This work proposes a new, online algorithm for estimating the local scale correction to apply to the output of a monocular SLAM system and obtain an as faithful as possible metric reconstruction of the 3D map and of the camera trajectory. Within a Ba yesian framework, it integrates observations from a deep-learning based generic object detector and a prior on the evolution of the scale drift. For each observation class, a predefined prior on the heights of the class objects is used. This allows to define the observations likelihood. Due to the scale drift inherent to monocular SLAM systems, we integrate a rough model on the dynamics of scale drift. Quantitative evaluations of the system are presented on the KITTI dataset, and compared with different approaches. The results show a superior performance of our proposal in terms of relative translational error when compared to other monocular systems.

علم الروبوتات

Web-Scale Generic Object Detection at Microsoft Bing

130 - Stephen Xi Chen , Saurajit Mukherjee , Unmesh Phadke 2021

In this paper, we present Generic Object Detection (GenOD), one of the largest object detection systems deployed to a web-scale general visual search engine that can detect over 900 categories for all Microsoft Bing Visual Search queries in near real -time. It acts as a fundamental visual query understanding service that provides object-centric information and shows gains in multiple production scenarios, improving upon domain-specific models. We discuss the challenges of collecting data, training, deploying and updating such a large-scale object detection model with multiple dependencies. We discuss a data collection pipeline that reduces per-bounding box labeling cost by 81.5% and latency by 61.2% while improving on annotation quality. We show that GenOD can improve weighted average precision by over 20% compared to multiple domain-specific models. We also improve the model update agility by nearly 2 times with the proposed disjoint detector training compared to joint fine-tuning. Finally we demonstrate how GenOD benefits visual search applications by significantly improving object-level search relevance by 54.9% and user engagement by 59.9%.

الرؤية الحاسوبية وتمييز الأنماط

Deep Mixture Density Network for Probabilistic Object Detection

112 - Yihui He , Jianren Wang 2019

Mistakes/uncertainties in object detection could lead to catastrophes when deploying robots in the real world. In this paper, we measure the uncertainties of object localization to minimize this kind of risk. Uncertainties emerge upon challenging cas es like occlusion. The bounding box borders of an occluded object can have multiple plausible configurations. We propose a deep multivariate mixture of Gaussians model for probabilistic object detection. The covariances help to learn the relationship between the borders, and the mixture components potentially learn different configurations of an occluded part. Quantitatively, our model improves the AP of the baselines by 3.9% and 1.4% on CrowdHuman and MS-COCO respectively with almost no computational or memory overhead. Qualitatively, our model enjoys explainability since the resulting covariance matrices and the mixture components help measure uncertainties.

الرؤية الحاسوبية وتمييز الأنماط

Multi-scale Edge-based U-shape Network for Salient Object Detection

145 - Han Sun , Yetong Bian , Ningzhong Liu 2021

Deep-learning based salient object detection methods achieve great improvements. However, there are still problems existing in the predictions, such as blurry boundary and inaccurate location, which is mainly caused by inadequate feature extraction a nd integration. In this paper, we propose a Multi-scale Edge-based U-shape Network (MEUN) to integrate various features at different scales to achieve better performance. To extract more useful information for boundary prediction, U-shape Edge Network modules are embedded in each decoder units. Besides, the additional down-sampling module alleviates the location inaccuracy. Experimental results on four benchmark datasets demonstrate the validity and reliability of the proposed method. Multi-scale Edge based U-shape Network also shows its superiority when compared with 15 state-of-the-art salient object detection methods.

الرؤية الحاسوبية وتمييز الأنماط

Learning Object Scale With Click Supervision for Object Detection

83 - Liao Zhang , Yan Yan , Lin Cheng 2020

Weakly-supervised object detection has recently attracted increasing attention since it only requires image-levelannotations. However, the performance obtained by existingmethods is still far from being satisfactory compared with fully-supervised obj ect detection methods. To achieve a good trade-off between annotation cost and object detection performance,we propose a simple yet effective method which incorporatesCNN visualization with click supervision to generate the pseudoground-truths (i.e., bounding boxes). These pseudo ground-truthscan be used to train a fully-supervised detector. To estimatethe object scale, we firstly adopt a proposal selection algorithmto preserve high-quality proposals, and then generate ClassActivation Maps (CAMs) for these preserved proposals by theproposed CNN visualization algorithm called Spatial AttentionCAM. Finally, we fuse these CAMs together to generate pseudoground-truths and train a fully-supervised object detector withthese ground-truths. Experimental results on the PASCAL VOC2007 and VOC 2012 datasets show that the proposed methodcan obtain much higher accuracy for estimating the object scale,compared with the state-of-the-art image-level based methodsand the center-click based method

الرؤية الحاسوبية وتمييز الأنماط

سجل دخول لتتمكن من نشر تعليقات

التعليقات

جاري جلب التعليقات

سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها

الجامعة العربية الخاصة للعلوم والتكنولوجيا

تفاصيل إضافية المزيد من الجامعات

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

Probabilistic Global Scale Estimation for MonoSLAM Based on Generic Object Detection

اسأل ChatGPT حول البحث

ﻻ يوجد ملخص باللغة العربية

اقرأ أيضاً