بحث متقدم مدعوم من الذكاء الصنعي

مساحة جديدة

اشترك بالحزمة الذهبية واحصل على وصول غير محدود شمرا أكاديميا

تسجيل مستخدم جديد

Semantic-Aware Depth Super-Resolution in Outdoor Scenes

197 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Miaomiao Liu

تاريخ النشر 2016

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Miaomiao Liu - Mathieu Salzmann - Xuming He

الرؤية الحاسوبية وتمييز الأنماط

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

While depth sensors are becoming increasingly popular, their spatial resolution often remains limited. Depth super-resolution therefore emerged as a solution to this problem. Despite much progress, state-of-the-art techniques suffer from two drawbacks: (i) they rely on the assumption that intensity edges coincide with depth discontinuities, which, unfortunately, is only true in controlled environments; and (ii) they typically exploit the availability of high-resolution training depth maps, which can often not be acquired in practice due to the sensors limitations. By contrast, here, we introduce an approach to performing depth super-resolution in more challenging conditions, such as in outdoor scenes. To this end, we first propose to exploit semantic information to better constrain the super-resolution process. In particular, we design a co-sparse analysis model that learns filters from joint intensity, depth and semantic information. Furthermore, we show how low-resolution training depth maps can be employed in our learning strategy. We demonstrate the benefits of our approach over state-of-the-art depth super-resolution methods on two outdoor scene datasets.

قيم البحث

اقرأ أيضاً

Perceptual deep depth super-resolution

305 - Oleg Voynov , Alexey Artemov , Vage Egiazarian 2018

RGBD images, combining high-resolution color and lower-resolution depth from various types of depth sensors, are increasingly common. One can significantly improve the resolution of depth maps by taking advantage of color information; deep learning m ethods make combining color and depth information particularly easy. However, fusing these two sources of data may lead to a variety of artifacts. If depth maps are used to reconstruct 3D shapes, e.g., for virtual reality applications, the visual quality of upsampled images is particularly important. The main idea of our approach is to measure the quality of depth map upsampling using renderings of resulting 3D surfaces. We demonstrate that a simple visual appearance-based loss, when used with either a trained CNN or simply a deep prior, yields significantly improved 3D shapes, as measured by a number of existing perceptual metrics. We compare this approach with a number of existing optimization and learning-based techniques.

الرؤية الحاسوبية وتمييز الأنماط الرسم الحاسوبي التعلم الآلي

DADA: Depth-aware Domain Adaptation in Semantic Segmentation

139 - Tuan-Hung Vu , Himalaya Jain , Maxime Bucher 2019

Unsupervised domain adaptation (UDA) is important for applications where large scale annotation of representative data is challenging. For semantic segmentation in particular, it helps deploy on real target domain data models that are trained on anno tated images from a different source domain, notably a virtual environment. To this end, most previous works consider semantic segmentation as the only mode of supervision for source domain data, while ignoring other, possibly available, information like depth. In this work, we aim at exploiting at best such a privileged information while training the UDA model. We propose a unified depth-aware UDA framework that leverages in several complementary ways the knowledge of dense depth in the source domain. As a result, the performance of the trained semantic segmentation model on the target domain is boosted. Our novel approach indeed achieves state-of-the-art performance on different challenging synthetic-2-real benchmarks.

الرؤية الحاسوبية وتمييز الأنماط

Learning Monocular Depth in Dynamic Scenes via Instance-Aware Projection Consistency

77 - Seokju Lee , Sunghoon Im , Stephen Lin 2021

We present an end-to-end joint training framework that explicitly models 6-DoF motion of multiple dynamic objects, ego-motion and depth in a monocular camera setup without supervision. Our technical contributions are three-fold. First, we highlight t he fundamental difference between inverse and forward projection while modeling the individual motion of each rigid object, and propose a geometrically correct projection pipeline using a neural forward projection module. Second, we design a unified instance-aware photometric and geometric consistency loss that holistically imposes self-supervisory signals for every background and object region. Lastly, we introduce a general-purpose auto-annotation scheme using any off-the-shelf instance segmentation and optical flow models to produce video instance segmentation maps that will be utilized as input to our training pipeline. These proposed elements are validated in a detailed ablation study. Through extensive experiments conducted on the KITTI and Cityscapes dataset, our framework is shown to outperform the state-of-the-art depth and motion estimation methods. Our code, dataset, and models are available at https://github.com/SeokjuLee/Insta-DM .

الرؤية الحاسوبية وتمييز الأنماط التعلم الآلي علم الروبوتات

Understanding and Predicting the Memorability of Outdoor Natural Scenes

363 - Jiaxin Lu , Mai Xu , Ren Yang 2018

Memorability measures how easily an image is to be memorized after glancing, which may contribute to designing magazine covers, tourism publicity materials, and so forth. Recent works have shed light on the visual features that make generic images, o bject images or face photographs memorable. However, these methods are not able to effectively predict the memorability of outdoor natural scene images. To overcome this shortcoming of previous works, in this paper, we provide an attempt to answer: what exactly makes outdoor natural scenes memorable. To this end, we first establish a large-scale outdoor natural scene image memorability (LNSIM) database, containing 2,632 outdoor natural scene images with their ground truth memorability scores and the multi-label scene category annotations. Then, similar to previous works, we mine our database to investigate how low-, middle- and high-level handcrafted features affect the memorability of outdoor natural scenes. In particular, we find that the high-level feature of scene category is rather correlated with outdoor natural scene memorability, and the deep features learnt by deep neural network (DNN) are also effective in predicting the memorability scores. Moreover, combining the deep features with the category feature can further boost the performance of memorability prediction. Therefore, we propose an end-to-end DNN based outdoor natural scene memorability (DeepNSM) predictor, which takes advantage of the learned category-related features. Then, the experimental results validate the effectiveness of our DeepNSM model, exceeding the state-of-the-art methods. Finally, we try to understand the reason of the good performance for our DeepNSM model, and also study the cases that our DeepNSM model succeeds or fails to accurately predict the memorability of outdoor natural scenes.

الرؤية الحاسوبية وتمييز الأنماط

TextSR: Content-Aware Text Super-Resolution Guided by Recognition

102 - Wenjia Wang , Enze Xie , Peize Sun 2019

Scene text recognition has witnessed rapid development with the advance of convolutional neural networks. Nonetheless, most of the previous methods may not work well in recognizing text with low resolution which is often seen in natural scene images. An intuitive solution is to introduce super-resolution techniques as pre-processing. However, conventional super-resolution methods in the literature mainly focus on reconstructing the detailed texture of natural images, which typically do not work well for text due to the unique characteristics of text. To tackle these problems, in this work, we propose a content-aware text super-resolution network to generate the information desired for text recognition. In particular, we design an end-to-end network that can perform super-resolution and text recognition simultaneously. Different from previous super-resolution methods, we use the loss of text recognition as the Text Perceptual Loss to guide the training of the super-resolution network, and thus it pays more attention to the text content, rather than the irrelevant background area. Extensive experiments on several challenging benchmarks demonstrate the effectiveness of our proposed method in restoring a sharp high-resolution image from a small blurred one, and show that the recognition performance clearly boosts up the performance of text recognizer. To our knowledge, this is the first work focusing on text super-resolution. Code will be released in https://github.com/xieenze/TextSR.

الرؤية الحاسوبية وتمييز الأنماط

سجل دخول لتتمكن من نشر تعليقات

التعليقات

جاري جلب التعليقات

سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها

جامعة بابل

تفاصيل إضافية المزيد من الجامعات

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

Semantic-Aware Depth Super-Resolution in Outdoor Scenes

اسأل ChatGPT حول البحث

ﻻ يوجد ملخص باللغة العربية

اقرأ أيضاً