ﻻ يوجد ملخص باللغة العربية
In this paper, we present Retargetable AR, a novel AR framework that yields an AR experience that is aware of scene contexts set in various real environments, achieving natural interaction between the virtual and real worlds. To this end, we characterize scene contexts with relationships among objects in 3D space, not with coordinates transformations. A context assumed by an AR content and a context formed by a real environment where users experience AR are represented as abstract graph representations, i.e. scene graphs. From RGB-D streams, our framework generates a volumetric map in which geometric and semantic information of a scene are integrated. Moreover, using the semantic map, we abstract scene objects as oriented bounding boxes and estimate their orientations. With such a scene representation, our framework constructs, in an online fashion, a 3D scene graph characterizing the context of a real environment for AR. The correspondence between the constructed graph and an AR scene graph denoting the context of AR content provides a semantically registered content arrangement, which facilitates natural interaction between the virtual and real worlds. We performed extensive evaluations on our prototype system through quantitative evaluation of the performance of the oriented bounding box estimation, subjective evaluation of the AR content arrangement based on constructed 3D scene graphs, and an online AR demonstration. The results of these evaluations showed the effectiveness of our framework, demonstrating that it can provide a context-aware AR experience in a variety of real scenes.
Panorama images have a much larger field-of-view thus naturally encode enriched scene context information compared to standard perspective images, which however is not well exploited in the previous scene understanding methods. In this paper, we prop
Scene flow estimation is the task to predict the point-wise 3D displacement vector between two consecutive frames of point clouds, which has important application in fields such as service robots and autonomous driving. Although many previous works h
A key requirement for leveraging supervised deep learning methods is the availability of large, labeled datasets. Unfortunately, in the context of RGB-D scene understanding, very little data is available -- current datasets cover a small range of sce
Controllable scene synthesis consists of generating 3D information that satisfy underlying specifications. Thereby, these specifications should be abstract, i.e. allowing easy user interaction, whilst providing enough interface for detailed control.
Segmentation-based scene text detection methods have been widely adopted for arbitrary-shaped text detection recently, since they make accurate pixel-level predictions on curved text instances and can facilitate real-time inference without time-consu