Light emitted from a source into a scene can undergo complex interactions with scene surfaces of different material types before being reflected. During this transport, every surface reflection is encoded in the properties of the photons that reach the detector, including time, direction, intensity, wavelength and polarization. Conventional imaging systems capture intensity by integrating over all other dimensions of the light, hiding this rich scene information. Existing methods are capable of untangling these measurements into their spatial and temporal dimensions, fueling geometric scene understanding tasks. However, examining material properties jointly with geometric properties is an open challenge that could enable unprecedented capabilities beyond geometric scene understanding, allowing for material-dependent scene understanding and imaging through complex transport. In this work, we close this gap, and propose a computational light transport imaging method that captures the spatially- and temporally-resolved complete polarimetric response of a scene. Our method hinges on a 7D tensor theory of light transport. We discover low-rank structure in the polarimetric tensor dimension and propose a data-driven rotating ellipsometry method that learns to exploit redundancy of polarimetric structure. We instantiate our theory with two prototypes: spatio-polarimetric imaging and coaxial temporal-polarimetric imaging. This allows us, for the first time, to decompose scene light transport into temporal, spatial, and complete polarimetric dimensions that unveil scene properties hidden to conventional methods. We validate the applicability of our method on diverse tasks, including shape reconstruction with subsurface scattering, seeing through scattering media, untangling multi-bounce light transport, breaking metamerism, and decomposition of crystals.
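
As a rough illustration of the measurement principle this abstract builds on, the sketch below simulates a conventional dual-rotating-retarder ellipsometer in NumPy: intensities recorded for rotating generator/analyzer configurations are linear in the 16 entries of the scene's Mueller matrix, which can then be recovered by least squares. The helper names and the 1:5 rotation schedule are illustrative assumptions, not the paper's learned, data-driven variant.

```python
import numpy as np

def mueller_linear_polarizer(theta):
    """Mueller matrix of an ideal linear polarizer at angle theta (radians)."""
    c, s = np.cos(2 * theta), np.sin(2 * theta)
    return 0.5 * np.array([
        [1,   c,   s,   0],
        [c, c*c, c*s,   0],
        [s, c*s, s*s,   0],
        [0,   0,   0,   0],
    ])

def mueller_quarter_wave(theta):
    """Mueller matrix of an ideal quarter-wave retarder with fast axis at theta."""
    c, s = np.cos(2 * theta), np.sin(2 * theta)
    return np.array([
        [1,   0,   0,  0],
        [0, c*c, c*s, -s],
        [0, c*s, s*s,  c],
        [0,   s,  -c,  0],
    ])

# Unknown scene Mueller matrix we want to recover (here: a weak depolarizer).
M_scene = np.diag([1.0, 0.6, 0.6, 0.4])

# Rotate generator and analyzer retarders at a 1:5 ratio (classic scheme) and
# record one scalar intensity per configuration.
S_in = np.array([1.0, 0.0, 0.0, 0.0])            # unpolarized source
angles = np.linspace(0, np.pi, 36, endpoint=False)
A_rows, intensities = [], []
for th in angles:
    G = mueller_quarter_wave(th) @ mueller_linear_polarizer(0.0)           # generator arm
    A = mueller_linear_polarizer(np.pi / 2) @ mueller_quarter_wave(5 * th)  # analyzer arm
    S_gen = G @ S_in
    a = A[0, :]
    # Detected intensity I = a^T M_scene s_gen is linear in the 16 entries of M_scene.
    A_rows.append(np.kron(a, S_gen))
    intensities.append(a @ M_scene @ S_gen)

# Least-squares recovery of all 16 Mueller entries from the intensity sequence.
M_hat = np.linalg.lstsq(np.array(A_rows), np.array(intensities), rcond=None)[0].reshape(4, 4)
print(np.round(M_hat, 3))
```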
Recent neural rendering methods have demonstrated accurate view interpolation by predicting volumetric density and color with a neural network. Although such volumetric representations can be supervised on static and dynamic scenes, existing methods implicitly bake the complete scene light transport into a single neural network for a given scene, including surface modeling, bidirectional scattering distribution functions, and indirect lighting effects. In contrast to traditional rendering pipelines, this prohibits changing surface reflectance, illumination, or composing other objects in the scene. In this work, we explicitly model the light transport between scene surfaces, and we rely on traditional integration schemes and the rendering equation to reconstruct a scene. The proposed method enables BSDF recovery under unknown lighting conditions and supports classic light transport algorithms such as path tracing. By learning decomposed transport with surface representations established in conventional rendering methods, the method naturally facilitates editing shape, reflectance, lighting, and scene composition. The method outperforms NeRV for relighting under known lighting conditions, and produces realistic reconstructions for relit and edited scenes. We validate the proposed approach for scene editing, relighting, and reflectance estimation learned from synthetic and captured views on a subset of the NeRV datasets.
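
For readers unfamiliar with the classical machinery the abstract refers to, here is a minimal Monte Carlo estimator of the rendering equation with a Lambertian BRDF standing in for the learned reflectance; the function names and the constant-sky light model are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def sample_cosine_hemisphere(n, rng):
    """Cosine-weighted direction around normal n (pdf = cos(theta) / pi)."""
    u1, u2 = rng.random(), rng.random()
    r, phi = np.sqrt(u1), 2 * np.pi * u2
    local = np.array([r * np.cos(phi), r * np.sin(phi), np.sqrt(1 - u1)])
    # Build an orthonormal frame around n and rotate the local sample into it.
    t = np.cross(n, [0.0, 1.0, 0.0] if abs(n[0]) > 0.1 else [1.0, 0.0, 0.0])
    t /= np.linalg.norm(t)
    b = np.cross(n, t)
    return local[0] * t + local[1] * b + local[2] * n

def brdf_lambert(albedo):
    """Lambertian BRDF, a stand-in for the learned reflectance."""
    return albedo / np.pi

def incident_radiance(x, wi):
    """Toy environment light: constant radiance from the upper hemisphere."""
    return 1.0 if wi[2] > 0 else 0.0

def render_outgoing_radiance(x, n, albedo, num_samples=256, seed=0):
    """Monte Carlo estimate of L_o = integral of f_r * L_i * cos(theta) dw (no emission)."""
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(num_samples):
        wi = sample_cosine_hemisphere(n, rng)
        cos_theta = max(np.dot(n, wi), 0.0)
        pdf = cos_theta / np.pi
        if pdf > 0:
            total += brdf_lambert(albedo) * incident_radiance(x, wi) * cos_theta / pdf
    return total / num_samples

# For an upward normal under a constant sky, the estimate equals the albedo (0.7).
print(render_outgoing_radiance(x=np.zeros(3), n=np.array([0.0, 0.0, 1.0]), albedo=0.7))
```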
Adversarial attacks play an essential role in understanding deep neural network predictions and improving their robustness. Existing attack methods aim to deceive convolutional neural network (CNN)-based classifiers by manipulating RGB images that are fed directly to the classifiers. However, these approaches typically neglect the influence of the camera optics and image processing pipeline (ISP) that produce the network inputs. ISPs transform RAW measurements to RGB images and traditionally are assumed to preserve adversarial patterns. However, these low-level pipelines can, in fact, destroy, introduce, or amplify adversarial patterns that can deceive a downstream detector. As a result, optimized patterns can become adversarial for the classifier after being transformed by a certain camera ISP and optics but not for others. In this work, we examine and develop such an attack that deceives a specific camera ISP while leaving others intact, using the same downstream classifier. We frame camera-specific attacks as a multi-task optimization problem, relying on a differentiable approximation of the ISP itself. We validate the proposed method using recent state-of-the-art automotive hardware ISPs, achieving a 92% fooling rate when attacking a specific ISP. We also demonstrate physical optics attacks with a 90% fooling rate for a specific camera lens.
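
The multi-task formulation can be sketched in PyTorch as follows; the toy ISP proxies, random classifier, and hyperparameters below are placeholder assumptions that only illustrate the objective of fooling the classifier through one ISP while keeping the prediction through another ISP correct.

```python
import torch
import torch.nn.functional as F

# Stand-in differentiable ISP proxies and classifier (the real method uses learned
# approximations of hardware ISPs; these toy modules just illustrate the objective).
def isp_target(raw):   # e.g. a gamma-style tone mapping
    return raw.clamp(1e-6, 1.0) ** (1 / 2.2)

def isp_other(raw):    # a different, simpler pipeline
    return raw.clamp(0.0, 1.0)

classifier = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 10))

raw = torch.rand(1, 3, 32, 32)          # clean RAW capture (toy data)
true_label = torch.tensor([3])
target_label = torch.tensor([7])        # class we want the target ISP path to produce

delta = torch.zeros_like(raw, requires_grad=True)
opt = torch.optim.Adam([delta], lr=1e-2)
eps = 8 / 255                            # perturbation budget on RAW values

for _ in range(200):
    adv_raw = (raw + delta).clamp(0.0, 1.0)
    logits_target = classifier(isp_target(adv_raw))
    logits_other = classifier(isp_other(adv_raw))
    # Multi-task objective: misclassify through the target ISP while keeping the
    # prediction through the other ISP(s) on the true label.
    loss = F.cross_entropy(logits_target, target_label) \
         + F.cross_entropy(logits_other, true_label)
    opt.zero_grad()
    loss.backward()
    opt.step()
    with torch.no_grad():
        delta.clamp_(-eps, eps)

print("target ISP prediction:", classifier(isp_target((raw + delta).clamp(0, 1))).argmax().item())
print("other ISP prediction: ", classifier(isp_other((raw + delta).clamp(0, 1))).argmax().item())
```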
Active stereo cameras that recover depth from structured light captures have become a cornerstone sensor modality for 3D scene reconstruction and understanding tasks across application domains. Existing active stereo cameras project a pseudo-random dot pattern on object surfaces to extract disparity independently of object texture. Such hand-crafted patterns are designed in isolation from the scene statistics, ambient illumination conditions, and the reconstruction method. In this work, we propose the first method to jointly learn structured illumination and reconstruction, parameterized by a diffractive optical element and a neural network, in an end-to-end fashion. To this end, we introduce a novel differentiable image formation model for active stereo, relying on both wave and geometric optics, and a novel trinocular reconstruction network. The jointly optimized pattern, which we dub Polka Lines, together with the reconstruction network, achieve state-of-the-art active-stereo depth estimates across imaging conditions. We validate the proposed method in simulation and on a hardware prototype, and show that our method outperforms existing active stereo systems.
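
One ingredient of such a differentiable image formation model is mapping a phase-only diffractive optical element (DOE) to the far-field illumination pattern it projects. The sketch below uses a Fraunhofer (FFT) approximation in PyTorch; the phase resolution and the surrogate loss are illustrative assumptions, not the paper's full wave-and-geometric-optics pipeline.

```python
import torch

def doe_far_field_pattern(phase):
    """Fraunhofer approximation: far-field intensity of a phase-only DOE.

    phase: (H, W) learnable phase delays in radians.
    Returns the normalized far-field intensity (the projected illumination pattern).
    """
    field = torch.exp(1j * phase)                    # unit-amplitude field through the DOE
    far = torch.fft.fftshift(torch.fft.fft2(field))  # far field is proportional to the Fourier transform
    intensity = far.abs() ** 2
    return intensity / intensity.sum()

# The phase profile is the optimizable parameter in an end-to-end pipeline:
phase = torch.randn(64, 64, requires_grad=True)
pattern = doe_far_field_pattern(phase)
loss = -(pattern * torch.log(pattern + 1e-12)).sum()  # entropy-style surrogate loss for illustration
loss.backward()                                       # gradients flow back to the DOE phase
print(pattern.shape, phase.grad.abs().mean())
```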
This work introduces an evaluation benchmark for depth estimation and completion using high-resolution depth measurements with an angular resolution of up to 25 arcseconds, akin to a 50 megapixel camera with per-pixel depth available. Existing datasets, such as the KITTI benchmark, provide only sparse reference measurements with an order of magnitude lower angular resolution - these sparse measurements are treated as ground truth by existing depth estimation methods. We propose an evaluation methodology in four characteristic automotive scenarios recorded in varying weather conditions (day, night, fog, rain). As a result, our benchmark allows us to evaluate the robustness of depth sensing methods in adverse weather and different driving conditions. Using the proposed evaluation data, we demonstrate that current stereo approaches provide significantly more stable depth estimates than monocular methods and lidar completion in adverse weather. Data and code are available at https://github.com/gruberto/PixelAccurateDepthBenchmark.git.
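
A dense per-pixel reference allows standard depth metrics to be evaluated without sparsity-induced bias. The snippet below shows the kind of metrics (MAE, RMSE, AbsRel, delta < 1.25) such an evaluation would typically compute; the synthetic inputs and function name are placeholders standing in for benchmark data, not the benchmark's released evaluation code.

```python
import numpy as np

def depth_metrics(pred, gt, valid_mask=None, max_depth=80.0):
    """Standard dense depth-evaluation metrics (units: meters)."""
    mask = (gt > 0) & (gt < max_depth)
    if valid_mask is not None:
        mask &= valid_mask
    p, g = pred[mask], gt[mask]
    abs_err = np.abs(p - g)
    ratio = np.maximum(p / g, g / p)
    return {
        "MAE [m]":    abs_err.mean(),
        "RMSE [m]":   np.sqrt((abs_err ** 2).mean()),
        "AbsRel":     (abs_err / g).mean(),
        "delta<1.25": (ratio < 1.25).mean(),
    }

# Example with synthetic maps; in the benchmark these would be the method output
# and the dense reference for one of the weather conditions.
gt = np.random.uniform(1.0, 60.0, size=(512, 1024))
pred = gt + np.random.normal(0.0, 0.5, size=gt.shape)
print(depth_metrics(pred, gt))
```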
In this work, we address the lack of 3D understanding of generative neural networks by introducing a persistent 3D feature embedding for view synthesis. To this end, we propose DeepVoxels, a learned representation that encodes the view-dependent appearance of a 3D scene without having to explicitly model its geometry. At its core, our approach is based on a Cartesian 3D grid of persistent embedded features that learn to make use of the underlying 3D scene structure. Our approach combines insights from 3D geometric computer vision with recent advances in learning image-to-image mappings based on adversarial loss functions. DeepVoxels is supervised, without requiring a 3D reconstruction of the scene, using a 2D re-rendering loss and enforces perspective and multi-view geometry in a principled manner. We apply our persistent 3D scene representation to the problem of novel view synthesis demonstrating high-quality results for a variety of challenging scenes.
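
The geometric core of such a representation is sampling a persistent voxel feature grid at points along camera rays in a differentiable way. The sketch below does this with trilinear interpolation via grid_sample; the grid size, ray parameterization, and helper names are illustrative assumptions rather than the DeepVoxels architecture.

```python
import torch
import torch.nn.functional as F

# Persistent 3D feature grid: (batch, channels, depth, height, width).
features = torch.randn(1, 16, 32, 32, 32, requires_grad=True)

def sample_features_along_rays(features, ray_origins, ray_dirs, num_steps=32):
    """Trilinearly sample voxel features at points marched along camera rays.

    ray_origins, ray_dirs: (R, 3) in the grid's normalized [-1, 1]^3 coordinates.
    Returns (R, num_steps, C) features, differentiable w.r.t. the grid.
    """
    t = torch.linspace(0.0, 2.0, num_steps)                                       # march through the volume
    points = ray_origins[:, None, :] + t[None, :, None] * ray_dirs[:, None, :]    # (R, S, 3)
    grid = points[None, :, :, None, :]                                            # (1, R, S, 1, 3) for grid_sample
    sampled = F.grid_sample(features, grid, mode="bilinear",
                            padding_mode="zeros", align_corners=True)             # (1, C, R, S, 1)
    return sampled[0, :, :, :, 0].permute(1, 2, 0)                                # (R, S, C)

# Toy rays: originate on one face of the cube and march toward the opposite face.
R = 4
origins = torch.cat([torch.rand(R, 2) * 2 - 1, -torch.ones(R, 1)], dim=1)
dirs = torch.tensor([[0.0, 0.0, 1.0]]).repeat(R, 1)
feats = sample_features_along_rays(features, origins, dirs)
feats.sum().backward()          # gradients reach the persistent grid
print(feats.shape, features.grad.abs().sum() > 0)
```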
Active 3D imaging systems have broad applications across disciplines, including biological imaging, remote sensing and robotics. Applications in these domains require fast acquisition times, high timing resolution, and high detection sensitivity. Single-photon avalanche diodes (SPADs) have emerged as one of the most promising detector technologies to achieve all of these requirements. However, these detectors are plagued by measurement distortions known as pileup, which fundamentally limit their precision. In this work, we develop a probabilistic image formation model that accurately models pileup. We devise inverse methods to efficiently and robustly estimate scene depth and reflectance from recorded photon counts using the proposed model along with statistical priors. With this algorithm, we not only demonstrate improvements to timing accuracy by more than an order of magnitude compared to the state-of-the-art, but this approach is also the first to facilitate sub-picosecond-accurate, photon-efficient 3D imaging in practical scenarios where widely-varying photon counts are observed.
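
For context, the pileup distortion can be simulated with the standard first-photon detection model and partially undone with the classic Coates estimator, sketched below in NumPy; this baseline only illustrates the distortion the probabilistic model addresses and is not the paper's method.

```python
import numpy as np

def pileup_histogram(photon_flux, num_pulses, rng):
    """Simulate first-photon-only SPAD detection (pileup) over repeated laser pulses."""
    counts = rng.poisson(photon_flux, size=(num_pulses, len(photon_flux)))
    detected = counts.any(axis=1)                 # pulses where at least one photon arrived
    first_bin = np.argmax(counts > 0, axis=1)     # the detector records only the first photon
    return np.bincount(first_bin[detected], minlength=len(photon_flux))

def coates_correction(hist, num_pulses):
    """Classic Coates estimate of the undistorted per-bin flux from a pileup histogram."""
    remaining = num_pulses - np.concatenate(([0], np.cumsum(hist)[:-1]))
    with np.errstate(divide="ignore", invalid="ignore"):
        flux = -np.log(1.0 - hist / np.maximum(remaining, 1))
    return np.nan_to_num(flux)

bins = np.arange(64)
true_flux = 3.0 * np.exp(-0.5 * (bins - 22) ** 2) + 0.01   # strong return at bin 22 plus ambient light
hist = pileup_histogram(true_flux, num_pulses=50_000, rng=np.random.default_rng(0))
# Pileup drags the apparent return toward earlier bins; the Coates estimate restores the true peak.
print("raw peak bin:", hist.argmax(), " corrected peak bin:", coates_correction(hist, 50_000).argmax())
```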
Imaging objects obscured by occluders is a significant challenge for many applications. A camera that could see around corners could help improve navigation and mapping capabilities of autonomous vehicles or make search and rescue missions more effective. Time-resolved single-photon imaging systems have recently been demonstrated to record optical information of a scene that can lead to an estimation of the shape and reflectance of objects hidden from the line of sight of a camera. However, existing non-line-of-sight (NLOS) reconstruction algorithms have been constrained in the types of light transport effects they model for the hidden scene parts. We introduce a factored NLOS light transport representation that accounts for partial occlusions and surface normals. Based on this model, we develop a factorization approach for inverse time-resolved light transport and demonstrate high-fidelity NLOS reconstructions for challenging scenes both in simulation and with an experimental NLOS imaging system.
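
To make the inverse light transport setting concrete, the sketch below simulates a toy confocal NLOS measurement and reconstructs a hidden point with plain backprojection, the classic baseline that ignores the partial occlusions and surface normals modeled by the factored representation; the geometry, bin width, and falloff model are illustrative assumptions.

```python
import numpy as np

C = 3e8  # speed of light [m/s]

def simulate_confocal_transients(wall_points, hidden_points, hidden_albedo,
                                 num_bins=256, bin_res=32e-12):
    """Toy confocal NLOS measurement: each wall point is illuminated and observed,
    so light travels wall -> hidden point -> wall (round trip of 2 * distance)."""
    transients = np.zeros((len(wall_points), num_bins))
    for i, w in enumerate(wall_points):
        d = np.linalg.norm(hidden_points - w, axis=1)            # wall-to-hidden distance
        t_bin = np.round(2 * d / C / bin_res).astype(int)
        falloff = hidden_albedo / np.maximum(d, 1e-3) ** 4       # 1/r^4 intensity falloff
        np.add.at(transients[i], np.clip(t_bin, 0, num_bins - 1), falloff)
    return transients

def backproject(transients, wall_points, volume_points, num_bins=256, bin_res=32e-12):
    """Sum each transient sample back onto all voxels consistent with its time of flight."""
    volume = np.zeros(len(volume_points))
    for i, w in enumerate(wall_points):
        d = np.linalg.norm(volume_points - w, axis=1)
        t_bin = np.clip(np.round(2 * d / C / bin_res).astype(int), 0, num_bins - 1)
        volume += transients[i, t_bin] * d ** 4                  # undo the radiometric falloff
    return volume

# Toy setup: a 5x5 scan grid on the relay wall (z = 0) and one hidden point at z = 0.5 m.
wall = np.stack(np.meshgrid(np.linspace(-0.5, 0.5, 5),
                            np.linspace(-0.5, 0.5, 5)), -1).reshape(-1, 2)
wall_points = np.concatenate([wall, np.zeros((len(wall), 1))], axis=1)
hidden = np.array([[0.25, 0.0, 0.5]])
voxels = np.concatenate([wall, np.full((len(wall), 1), 0.5)], axis=1)   # candidate voxels at z = 0.5
meas = simulate_confocal_transients(wall_points, hidden, np.array([1.0]))
recon = backproject(meas, wall_points, voxels)
print("recovered hidden x, y:", voxels[recon.argmax(), :2])
```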
A broad class of problems at the core of computational imaging, sensing, and low-level computer vision reduces to the inverse problem of extracting latent images that follow a prior distribution, from measurements taken under a known physical image formation model. Traditionally, hand-crafted priors along with iterative optimization methods have been used to solve such problems. In this paper we present unrolled optimization with deep priors, a principled framework for infusing knowledge of the image formation into deep networks that solve inverse problems in imaging, inspired by classical iterative methods. We show that instances of the framework outperform the state-of-the-art by a substantial margin for a wide variety of imaging problems, such as denoising, deblurring, and compressed sensing magnetic resonance imaging (MRI). Moreover, we conduct experiments that explain how the framework is best used and why it outperforms previous methods.
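
A minimal instance of this idea is unrolled proximal gradient descent in which the proximal operator of the prior is replaced by a small CNN and the whole iteration is trained end-to-end; the network width, step-size parameterization, and toy downsampling operator below are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class ProxNet(nn.Module):
    """Small CNN standing in for the proximal operator of a learned image prior."""
    def __init__(self, channels=1, width=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels, width, 3, padding=1), nn.ReLU(),
            nn.Conv2d(width, width, 3, padding=1), nn.ReLU(),
            nn.Conv2d(width, channels, 3, padding=1),
        )
    def forward(self, x):
        return x + self.net(x)          # residual prox step

class UnrolledSolver(nn.Module):
    """Unrolled proximal-gradient descent for y = A(x) + noise."""
    def __init__(self, forward_op, adjoint_op, num_iters=8):
        super().__init__()
        self.A, self.At = forward_op, adjoint_op
        self.prox = nn.ModuleList([ProxNet() for _ in range(num_iters)])
        self.step = nn.Parameter(torch.full((num_iters,), 0.5))   # learned step sizes

    def forward(self, y):
        x = self.At(y)                                            # initialize from the adjoint
        for k, prox in enumerate(self.prox):
            grad = self.At(self.A(x) - y)                         # data-term gradient
            x = prox(x - self.step[k] * grad)                     # learned prox of the prior
        return x

# Toy inverse problem: 'A' is a crude blur/decimation implemented with average pooling.
def A(x):  return nn.functional.avg_pool2d(x, 2)
def At(y): return nn.functional.interpolate(y, scale_factor=2, mode="nearest")

model = UnrolledSolver(A, At)
x_true = torch.rand(4, 1, 32, 32)
y = A(x_true) + 0.01 * torch.randn(4, 1, 16, 16)
loss = nn.functional.mse_loss(model(y), x_true)                   # end-to-end supervision
loss.backward()
print(loss.item())
```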