Bridging Unsupervised and Supervised Depth from Focus via All-in-Focus Supervision


Published by Yu-Lun Liu

Publication date: 2021

Research field: Informatics Engineering

Paper language: English

Authors: Ning-Hsu Wang, Ren Wang, Yu-Lun Liu

Computer Vision and Pattern Recognition


Depth estimation is a long-standing yet important task in computer vision. Most previous works estimate depth from input images and assume the images are all-in-focus (AiF), which is less common in real-world applications. A few works, on the other hand, take defocus blur into account and treat it as another cue for depth estimation. In this paper, we propose a method to estimate not only a depth map but also an AiF image from a set of images with different focus positions (known as a focal stack). We design a shared architecture to exploit the relationship between depth and AiF estimation. As a result, the proposed method can be trained either in a supervised manner with ground-truth depth or in an unsupervised manner with AiF images as supervisory signals. We show in various experiments that our method outperforms state-of-the-art methods both quantitatively and qualitatively, and is also more efficient at inference time.
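The abstract does not spell out how the shared architecture couples the two outputs. The sketch below shows one plausible way, under the assumption that a network produces a per-pixel score for every slice of the focal stack: a softmax over the slice axis gives an attention volume, the depth map is the attention-weighted focus position, and the AiF image is the attention-weighted sum of the slices. The function names (`depth_and_aif`, `aif_l1_loss`), the hand-made `scores` array standing in for network output, and the L1 loss are all illustrative assumptions, not the authors' confirmed implementation.

```python
import numpy as np


def softmax(x, axis=0):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def depth_and_aif(focal_stack, focus_positions, scores):
    """Derive depth and an AiF image from one shared attention volume.

    focal_stack:     (S, H, W) grayscale slices of the focal stack
    focus_positions: (S,) focus distance of each slice
    scores:          (S, H, W) per-pixel slice scores (assumed network output)
    """
    attention = softmax(scores, axis=0)  # sums to 1 over the S slices
    # Depth: expected focus position under the attention weights.
    depth = np.tensordot(focus_positions, attention, axes=(0, 0))
    # AiF: attention-weighted blend of the slices.
    aif = (attention * focal_stack).sum(axis=0)
    return depth, aif


def aif_l1_loss(pred_aif, ref_aif):
    """Assumed unsupervised signal: mean absolute error against an AiF image."""
    return float(np.abs(pred_aif - ref_aif).mean())


# Toy example: 3 slices of a 2x2 image, with a known sharpest slice per pixel.
focus_positions = np.array([0.1, 0.5, 1.0])
best_slice = np.array([[0, 1], [2, 1]])
# Hypothetical scores: strongly peaked at the sharpest slice of each pixel.
scores = 20.0 * (np.arange(3)[:, None, None] == best_slice[None])
focal_stack = np.random.default_rng(0).random((3, 2, 2))

depth, aif = depth_and_aif(focal_stack, focus_positions, scores)
print(depth)
print(aif_l1_loss(aif, aif))
```

Because both outputs are read off the same attention volume, a loss on either one (ground-truth depth or a reference AiF image) updates the same scores, which is one way the supervised and unsupervised settings could share a single model.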


137 - Keyang Zhou , Kailun Yang , Kaiwei Wang 2021

Depth estimation, a necessary cue for converting 2D images into 3D space, has been applied in many machine vision areas. However, for full 360-degree surrounding geometric sensing, traditional stereo matching algorithms for depth estimation are limited by large noise, low accuracy, and strict requirements for multi-camera calibration. In this work, for unified surrounding perception, we introduce panoramic images to obtain a larger field of view. We extend PADENet, which first appeared in our previous conference work on outdoor scene understanding, to perform panoramic monocular depth estimation with a focus on indoor scenes. At the same time, we adapt the training process of the neural network to the characteristics of panoramic images. In addition, we fuse a traditional stereo matching algorithm with deep learning methods to further improve the accuracy of depth predictions. Through a comprehensive variety of experiments, this research demonstrates the effectiveness of our schemes for indoor scene perception.

Computer Vision and Pattern Recognition, Robotics

Bridging Gap between Image Pixels and Semantics via Supervision: A Survey

78 - Jiali Duan , C.-C. Jay Kuo 2021

The gap between low-level features and the semantic meanings of images, called the semantic gap, has been known for decades. Resolving the semantic gap is a long-standing problem. This work reviews the semantic gap problem and surveys recent efforts to bridge it. Most importantly, we claim that the semantic gap is primarily bridged through supervised learning today. Experiences are drawn from two application domains to illustrate this point: 1) object detection and 2) metric learning for content-based image retrieval (CBIR). To begin with, this paper offers a historical retrospective on supervision, makes a gradual transition to the modern data-driven methodology, and introduces commonly used datasets. Then, it summarizes various supervision methods for bridging the semantic gap in the context of object detection and metric learning.

Computer Vision and Pattern Recognition

Counting with Focus for Free

62 - Zenglin Shi , Pascal Mettes , 2019

This paper aims to count arbitrary objects in images. The leading counting approaches start from point annotations per object from which they construct density maps. Then, their training objective transforms input images to density maps through deep convolutional networks. We posit that the point annotations serve more supervision purposes than just constructing density maps. We introduce ways to repurpose the points for free. First, we propose supervised focus from segmentation, where points are converted into binary maps. The binary maps are combined with a network branch and accompanying loss function to focus on areas of interest. Second, we propose supervised focus from global density, where the ratio of point annotations to image pixels is used in another branch to regularize the overall density estimation. To assist both the density estimation and the focus from segmentation, we also introduce an improved kernel size estimator for the point annotations. Experiments on six datasets show that all our contributions reduce the counting error, regardless of the base network, resulting in state-of-the-art accuracy using only a single network. Finally, we are the first to count on WIDER FACE, allowing us to show the benefits of our approach in handling varying object scales and crowding levels. Code is available at https://github.com/shizenglin/Counting-with-Focus-for-Free
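The three uses of point annotations described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions: a fixed Gaussian kernel (the paper mentions an improved kernel size estimator, which is not reproduced here), a fixed disc radius for the binary map, and hypothetical function names (`density_map`, `binary_focus_map`, `global_density`). It is not the code from the linked repository.

```python
import numpy as np


def gaussian_kernel(size, sigma):
    """Square Gaussian kernel normalized to sum to 1."""
    ax = np.arange(size) - size // 2
    g = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    k = np.outer(g, g)
    return k / k.sum()


def density_map(points, shape, size=7, sigma=1.5):
    """Place one unit-mass Gaussian per annotated point (row, col).

    Mass of kernels that overlap the image border is cropped, so the map
    sums to the object count only when all points are away from the border.
    """
    h, w = shape
    pad = size // 2
    padded = np.zeros((h + 2 * pad, w + 2 * pad))
    k = gaussian_kernel(size, sigma)
    for r, c in points:
        padded[r:r + size, c:c + size] += k
    return padded[pad:-pad, pad:-pad]


def binary_focus_map(points, shape, radius=2):
    """Binary segmentation-style map: 1 inside a disc around each point."""
    rows, cols = np.ogrid[:shape[0], :shape[1]]
    mask = np.zeros(shape, dtype=bool)
    for r, c in points:
        mask |= (rows - r) ** 2 + (cols - c) ** 2 <= radius ** 2
    return mask.astype(np.uint8)


def global_density(points, shape):
    """Ratio of point annotations to image pixels."""
    return len(points) / (shape[0] * shape[1])


# Toy example: three annotated objects in a 20x20 image.
points = [(5, 5), (10, 12), (14, 6)]
shape = (20, 20)
density = density_map(points, shape)
focus = binary_focus_map(points, shape)
ratio = global_density(points, shape)
print(density.sum(), int(focus.sum()), ratio)
```

The density map is the regression target, while the binary map and the global ratio are extra training signals derived from the same points, which is the sense in which the abstract calls them "for free".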

Computer Vision and Pattern Recognition

Bridging the Gap between Language Model and Reading Comprehension: Unsupervised MRC via Self-Supervision

76 - Ning Bian , Xianpei Han , Bo Chen 2021

Despite recent success in machine reading comprehension (MRC), learning high-quality MRC models still requires large-scale labeled training data, even when using strong pre-trained language models (PLMs). The pre-training tasks for PLMs are not question-answering or MRC-based tasks, so existing PLMs cannot be used directly for unsupervised MRC. Specifically, MRC aims to spot an accurate answer span in a given document, whereas PLMs focus on token filling in sentences. In this paper, we propose a new framework for unsupervised MRC. First, we propose to learn to spot answer spans in documents via self-supervised learning, by designing a self-supervision pretext task for MRC called Spotting-MLM. Solving this task requires capturing deep interactions between sentences in documents. Second, we apply a simple sentence rewriting strategy at the inference stage to alleviate the expression mismatch between questions and documents. Experiments show that our method achieves new state-of-the-art performance for unsupervised MRC.

Computation and Language

Removing out-of-focus blur from a single image

106 - Guodong Xu , Chaoqiang Liu , Hui Ji 2018

Reproducing an all-in-focus image from an image with defocused regions is of practical value in many applications, e.g., digital photography and robotics. Using the output of an existing defocus map estimator, existing approaches first segment a defocused image into multiple regions, each blurred by a Gaussian kernel with a different variance, and then deblur each region using the corresponding Gaussian kernel. In this paper, we propose a blind deconvolution method specifically designed for removing defocus blur from an image, by providing effective solutions to two critical problems: 1) suppressing the artifacts caused by segmentation error by introducing an additional variable regularized by a weighted $\ell_0$-norm; and 2) estimating the defocus kernel more accurately using non-parametric symmetry and low-rank based constraints on the kernel. Experiments on real datasets show the advantages of the proposed method over existing ones, thanks to the effective treatment of these two issues during deconvolution.

Computer Vision and Pattern Recognition
