
Seeing the World in a Bag of Chips

Published by: Jeong Joon Park
Publication date: 2020
Research field: Informatics Engineering
Paper language: English





We address the dual problems of novel view synthesis and environment reconstruction from hand-held RGBD sensors. Our contributions include 1) modeling highly specular objects, 2) modeling inter-reflections and Fresnel effects, and 3) enabling surface light field reconstruction with the same input needed to reconstruct shape alone. In cases where the scene surface has a strong mirror-like material component, we generate highly detailed environment images, revealing room composition, objects, people, buildings, and trees visible through windows. Our approach yields state-of-the-art view synthesis results, operates on low dynamic range imagery, and is robust to geometric and calibration errors.
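The mirror-like component the abstract relies on is governed by Fresnel reflectance: surfaces reflect far more of the environment at grazing angles than head-on. As a rough illustration only, not the paper's actual model, the sketch below combines Schlick's approximation of Fresnel reflectance with the mirror-reflection direction used to look up an environment map; the function names and the dielectric constant f0 = 0.04 are illustrative assumptions.

```python
import numpy as np

def schlick_fresnel(cos_theta, f0=0.04):
    """Schlick's approximation of Fresnel reflectance.

    cos_theta: cosine of the angle between the view direction and the surface normal.
    f0: reflectance at normal incidence (~0.04 is a common dielectric assumption).
    """
    return f0 + (1.0 - f0) * (1.0 - cos_theta) ** 5

def mirror_direction(view_dir, normal):
    """Reflect the view direction about the surface normal (both normalized here)."""
    view_dir = view_dir / np.linalg.norm(view_dir)
    normal = normal / np.linalg.norm(normal)
    return view_dir - 2.0 * np.dot(view_dir, normal) * normal

# Grazing angles (small cos_theta) reflect far more of the environment than head-on views.
for cos_theta in (1.0, 0.5, 0.1):
    print(f"cos_theta={cos_theta:.1f} -> reflectance={schlick_fresnel(cos_theta):.3f}")
```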


Read also

Vector-quantized local features frequently used in bag-of-visual-words approaches are the backbone of popular visual recognition systems due to both their simplicity and their performance. Despite their success, bag-of-words histograms basically contain low-level image statistics (e.g., the number of edges of different orientations). The question remains: how much visual information is lost in quantization when mapping visual features to code words? To answer this question, we present an in-depth analysis of the effect of local feature quantization on human recognition performance. Our analysis is based on recovering the visual information by inverting quantized local features and presenting these visualizations, obtained with different codebook sizes, to human observers. Although feature inversion techniques have been around for quite a while, to the best of our knowledge, our technique is the first to specifically visualize the effect of feature quantization. Thereby, we are now able to compare single steps in common image classification pipelines to their human counterparts.
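The quantization step analyzed here assigns each local descriptor to its nearest codebook entry and accumulates a histogram over code words; the codebook size controls how much visual information survives. A minimal sketch of that pipeline, assuming SIFT-like 128-dimensional descriptors and a k-means codebook (the sizes and random data are placeholders, not the paper's setup):

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
training_descriptors = rng.normal(size=(2000, 128))    # stand-in for SIFT-like local features

# Learn a visual codebook; the codebook size is the quantity varied in the study.
codebook = KMeans(n_clusters=256, n_init=4, random_state=0).fit(training_descriptors)

def bag_of_words_histogram(image_descriptors, codebook):
    """Quantize local features to their nearest code words and return a normalized histogram."""
    words = codebook.predict(image_descriptors)
    hist = np.bincount(words, minlength=codebook.n_clusters).astype(float)
    return hist / hist.sum()

image_descriptors = rng.normal(size=(300, 128))         # features from a single image
print(bag_of_words_histogram(image_descriptors, codebook)[:10])
```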
198 - Thorsten Schumm 2004
We present a detailed model describing the effects of wire corrugation on the trapping potential experienced by a cloud of atoms above a current carrying micro wire. We calculate the distortion of the current distribution due to corrugation and then derive the corresponding roughness in the magnetic field above the wire. Scaling laws are derived for the roughness as a function of height above a ribbon shaped wire. We also present experimental data on micro wire traps using cold atoms which complement some previously published measurements and which demonstrate that wire corrugation can satisfactorily explain our observations of atom cloud fragmentation above electroplated gold wires. Finally, we present measurements of the corrugation of new wires fabricated by electron beam lithography and evaporation of gold. These wires appear to be substantially smoother than electroplated wires.
Despite the success of Generative Adversarial Networks (GANs), mode collapse remains a serious issue during GAN training. To date, little work has focused on understanding and quantifying which modes have been dropped by a model. In this work, we visualize mode collapse at both the distribution level and the instance level. First, we deploy a semantic segmentation network to compare the distribution of segmented objects in the generated images with the target distribution in the training set. Differences in statistics reveal object classes that are omitted by a GAN. Second, given the identified omitted object classes, we visualize the GAN's omissions directly. In particular, we compare specific differences between individual photos and their approximate
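The distribution-level comparison described here amounts to segmenting both training images and GAN samples and comparing per-class pixel statistics. A hedged sketch of that bookkeeping, with random label maps standing in for the outputs of a segmentation network (the class count and data are placeholders, not the paper's setup):

```python
import numpy as np

def class_frequencies(segmentation_maps, num_classes):
    """Fraction of pixels assigned to each semantic class over a set of segmented images."""
    counts = np.zeros(num_classes)
    for seg in segmentation_maps:
        counts += np.bincount(seg.ravel(), minlength=num_classes)
    return counts / counts.sum()

# In practice these maps would come from running a segmentation network on
# training photos and GAN samples; random labels stand in for them here.
rng = np.random.default_rng(0)
num_classes = 20
seg_real = [rng.integers(0, num_classes, size=(64, 64)) for _ in range(10)]
seg_fake = [rng.integers(0, num_classes, size=(64, 64)) for _ in range(10)]

freq_real = class_frequencies(seg_real, num_classes)
freq_fake = class_frequencies(seg_fake, num_classes)

# Classes the GAN under-represents show the largest negative gap.
gap = freq_fake - freq_real
print("most under-represented classes:", np.argsort(gap)[:5])
```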
The fusion of multimodal sensor streams, such as camera, lidar, and radar measurements, plays a critical role in object detection for autonomous vehicles, which base their decision making on these inputs. While existing methods exploit redundant information in good environmental conditions, they fail in adverse weather where the sensory streams can be asymmetrically distorted. These rare edge-case scenarios are not represented in available datasets, and existing fusion architectures are not designed to handle them. To address this challenge we present a novel multimodal dataset acquired over more than 10,000 km of driving in northern Europe. Although this dataset is the first large multimodal dataset in adverse weather, with 100k labels for lidar, camera, radar, and gated NIR sensors, it does not facilitate training as extreme weather is rare. To this end, we present a deep fusion network for robust fusion without a large corpus of labeled training data covering all asymmetric distortions. Departing from proposal-level fusion, we propose a single-shot model that adaptively fuses features, driven by measurement entropy. We validate the proposed method, trained on clean data, on our extensive validation dataset. Code and data are available here: https://github.com/princeton-computational-imaging/SeeingThroughFog.
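Entropy-driven fusion, as described here, steers how much each sensor contributes based on how informative its current measurement is. One plausible reading, sketched below, computes a per-sensor intensity-histogram entropy and uses it to weight feature maps before summing them; the gating scheme, function names, and toy shapes are assumptions for illustration, not the published architecture.

```python
import torch
import torch.nn.functional as F

def measurement_entropy(images, bins=16):
    """Shannon entropy of each image's intensity histogram; images in [0, 1], shape (B, C, H, W)."""
    flat = images.reshape(images.shape[0], -1)
    entropies = []
    for sample in flat:
        hist = torch.histc(sample, bins=bins, min=0.0, max=1.0)
        p = hist / hist.sum().clamp(min=1.0)
        entropies.append(-(p * torch.log(p.clamp(min=1e-8))).sum())
    return torch.stack(entropies)                        # shape (B,)

def entropy_gated_fusion(feature_maps, sensor_images):
    """Weight each sensor's feature map by its softmax-normalized measurement entropy."""
    ents = torch.stack([measurement_entropy(img) for img in sensor_images])  # (S, B)
    weights = F.softmax(ents, dim=0)                                          # normalize across sensors
    return sum(w.view(-1, 1, 1, 1) * f for w, f in zip(weights, feature_maps))

# Toy example: camera and lidar streams for a batch of two frames.
cam_img, lidar_img = torch.rand(2, 1, 64, 64), torch.rand(2, 1, 64, 64)
cam_feat, lidar_feat = torch.rand(2, 8, 16, 16), torch.rand(2, 8, 16, 16)
fused = entropy_gated_fusion([cam_feat, lidar_feat], [cam_img, lidar_img])
print(fused.shape)                                        # torch.Size([2, 8, 16, 16])
```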
123 - Till S. Hartmann 2018
Classical convolutional neural networks (cCNNs) are very good at categorizing objects in images. But, unlike human vision, which is relatively robust to noise in images, the performance of cCNNs declines quickly as image quality worsens. Here we propose to use recurrent connections within the convolutional layers to make networks robust against pixel noise, such as could arise from imaging at low light levels, and thereby significantly increase their performance when tested with simulated noisy video sequences. We show that cCNNs classify images with high signal-to-noise ratios (SNRs) well, but are easily outperformed when tested with low SNR images (high noise levels) by convolutional neural networks that have recurrence added to their convolutional layers, henceforth referred to as gruCNNs. Adding Bayes-optimal temporal integration, which allows the cCNN to integrate multiple image frames, still does not match gruCNN performance. Additionally, we show that at low SNRs, the probabilities predicted by the gruCNN (after calibration) have higher confidence than those predicted by the cCNN. We propose to consider recurrent connections in the early stages of neural networks as a solution for computer vision under imperfect lighting conditions and in noisy environments: challenges faced by real-time video streams from autonomous driving at night, in rain or snow, and in other non-ideal situations.
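The gruCNN idea described above replaces a feed-forward convolutional layer with one whose activations are updated recurrently across video frames using GRU-style gating, so evidence accumulates over time. A minimal convolutional GRU cell in PyTorch, assuming illustrative channel counts and kernel sizes rather than the paper's exact architecture:

```python
import torch
import torch.nn as nn

class ConvGRUCell(nn.Module):
    """GRU gating applied with convolutions, so the hidden state is a feature map."""

    def __init__(self, in_ch, hidden_ch, kernel_size=3):
        super().__init__()
        pad = kernel_size // 2
        self.gates = nn.Conv2d(in_ch + hidden_ch, 2 * hidden_ch, kernel_size, padding=pad)
        self.candidate = nn.Conv2d(in_ch + hidden_ch, hidden_ch, kernel_size, padding=pad)
        self.hidden_ch = hidden_ch

    def forward(self, x, h=None):
        if h is None:
            h = x.new_zeros(x.shape[0], self.hidden_ch, x.shape[2], x.shape[3])
        z, r = torch.sigmoid(self.gates(torch.cat([x, h], dim=1))).chunk(2, dim=1)
        h_tilde = torch.tanh(self.candidate(torch.cat([x, r * h], dim=1)))
        return (1 - z) * h + z * h_tilde                  # updated hidden state

# Accumulate evidence over a short noisy video clip (toy shapes).
cell = ConvGRUCell(in_ch=3, hidden_ch=16)
h = None
for frame in torch.rand(5, 1, 3, 32, 32):                 # 5 frames, batch of 1
    h = cell(frame, h)
print(h.shape)                                             # torch.Size([1, 16, 32, 32])
```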