بحث متقدم مدعوم من الذكاء الصنعي

مساحة جديدة

اشترك بالحزمة الذهبية واحصل على وصول غير محدود شمرا أكاديميا

تسجيل مستخدم جديد

Amodal Completion and Size Constancy in Natural Scenes

268 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Abhishek Kar

تاريخ النشر 2015

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Abhishek Kar - Shubham Tulsiani - Jo~ao Carreira

الرؤية الحاسوبية وتمييز الأنماط

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

We consider the problem of enriching current object detection systems with veridical object sizes and relative depth estimates from a single image. There are several technical challenges to this, such as occlusions, lack of calibration data and the scale ambiguity between object size and distance. These have not been addressed in full generality in previous work. Here we propose to tackle these issues by building upon advances in object recognition and using recently created large-scale datasets. We first introduce the task of amodal bounding box completion, which aims to infer the the full extent of the object instances in the image. We then propose a probabilistic framework for learning category-specific object size distributions from available annotations and leverage these in conjunction with amodal completion to infer veridical sizes in novel images. Finally, we introduce a focal length prediction approach that exploits scene recognition to overcome inherent scaling ambiguities and we demonstrate qualitative results on challenging real-world scenes.

قيم البحث

363 - Jiaxin Lu , Mai Xu , Ren Yang 2018

Memorability measures how easily an image is to be memorized after glancing, which may contribute to designing magazine covers, tourism publicity materials, and so forth. Recent works have shed light on the visual features that make generic images, o bject images or face photographs memorable. However, these methods are not able to effectively predict the memorability of outdoor natural scene images. To overcome this shortcoming of previous works, in this paper, we provide an attempt to answer: what exactly makes outdoor natural scenes memorable. To this end, we first establish a large-scale outdoor natural scene image memorability (LNSIM) database, containing 2,632 outdoor natural scene images with their ground truth memorability scores and the multi-label scene category annotations. Then, similar to previous works, we mine our database to investigate how low-, middle- and high-level handcrafted features affect the memorability of outdoor natural scenes. In particular, we find that the high-level feature of scene category is rather correlated with outdoor natural scene memorability, and the deep features learnt by deep neural network (DNN) are also effective in predicting the memorability scores. Moreover, combining the deep features with the category feature can further boost the performance of memorability prediction. Therefore, we propose an end-to-end DNN based outdoor natural scene memorability (DeepNSM) predictor, which takes advantage of the learned category-related features. Then, the experimental results validate the effectiveness of our DeepNSM model, exceeding the state-of-the-art methods. Finally, we try to understand the reason of the good performance for our DeepNSM model, and also study the cases that our DeepNSM model succeeds or fails to accurately predict the memorability of outdoor natural scenes.

الرؤية الحاسوبية وتمييز الأنماط

Drill-down: Interactive Retrieval of Complex Scenes using Natural Language Queries

129 - Fuwen Tan , Paola Cascante-Bonilla , Xiaoxiao Guo 2019

This paper explores the task of interactive image retrieval using natural language queries, where a user progressively provides input queries to refine a set of retrieval results. Moreover, our work explores this problem in the context of complex ima ge scenes containing multiple objects. We propose Drill-down, an effective framework for encoding multiple queries with an efficient compact state representation that significantly extends current methods for single-round image retrieval. We show that using multiple rounds of natural language queries as input can be surprisingly effective to find arbitrarily specific images of complex scenes. Furthermore, we find that existing image datasets with textual captions can provide a surprisingly effective form of weak supervision for this task. We compare our method with existing sequential encoding and embedding networks, demonstrating superior performance on two proposed benchmarks: automatic image retrieval on a simulated scenario that uses region captions as queries, and interactive image retrieval using real queries from human evaluators.

الرؤية الحاسوبية وتمييز الأنماط

Probabilistic Color Constancy

74 - Firas Laakom , Jenni Raitoharju , Alexandros Iosifidis 2020

In this paper, we propose a novel unsupervised color constancy method, called Probabilistic Color Constancy (PCC). We define a framework for estimating the illumination of a scene by weighting the contribution of different image regions using a graph -based representation of the image. To estimate the weight of each (super-)pixel, we rely on two assumptions: (Super-)pixels with similar colors contribute similarly and darker (super-)pixels contribute less. The resulting system has one global optimum solution. The proposed method achieves competitive performance, compared to the state-of-the-art, on INTEL-TAU dataset.

الرؤية الحاسوبية وتمييز الأنماط معالجة الصور والفيديو

Detecting Dominant Vanishing Points in Natural Scenes with Application to Composition-Sensitive Image Retrieval

183 - Zihan Zhou , Farshid Farhat , James Z. Wang 2016

Linear perspective is widely used in landscape photography to create the impression of depth on a 2D photo. Automated understanding of linear perspective in landscape photography has several real-world applications, including aesthetics assessment, i mage retrieval, and on-site feedback for photo composition, yet adequate automated understanding has been elusive. We address this problem by detecting the dominant vanishing point and the associated line structures in a photo. However, natural landscape scenes pose great technical challenges because often the inadequate number of strong edges converging to the dominant vanishing point is inadequate. To overcome this difficulty, we propose a novel vanishing point detection method that exploits global structures in the scene via contour detection. We show that our method significantly outperforms state-of-the-art methods on a public ground truth landscape image dataset that we have created. Based on the detection results, we further demonstrate how our approach to linear perspective understanding provides on-site guidance to amateur photographers on their work through a novel viewpoint-specific image retrieval system.

الرؤية الحاسوبية وتمييز الأنماط الوسائط المتعددة

Infinite Nature: Perpetual View Generation of Natural Scenes from a Single Image

74 - Andrew Liu , Richard Tucker , Varun Jampani 2020

We introduce the problem of perpetual view generation - long-range generation of novel views corresponding to an arbitrarily long camera trajectory given a single image. This is a challenging problem that goes far beyond the capabilities of current v iew synthesis methods, which quickly degenerate when presented with large camera motions. Methods for video generation also have limited ability to produce long sequences and are often agnostic to scene geometry. We take a hybrid approach that integrates both geometry and image synthesis in an iterative `emph{render}, emph{refine} and emph{repeat} framework, allowing for long-range generation that cover large distances after hundreds of frames. Our approach can be trained from a set of monocular video sequences. We propose a dataset of aerial footage of coastal scenes, and compare our method with recent view synthesis and conditional video generation baselines, showing that it can generate plausible scenes for much longer time horizons over large camera trajectories compared to existing methods. Project page at https://infinite-nature.github.io/.

الرؤية الحاسوبية وتمييز الأنماط الرسم الحاسوبي

سجل دخول لتتمكن من نشر تعليقات

التعليقات

جاري جلب التعليقات

سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها

الجامعة العربية الدولية الخاصة

تفاصيل إضافية المزيد من الجامعات

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

Amodal Completion and Size Constancy in Natural Scenes

اسأل ChatGPT حول البحث

ﻻ يوجد ملخص باللغة العربية

اقرأ أيضاً