ﻻ يوجد ملخص باللغة العربية
Realistic simulators are critical for training and verifying robotics systems. While most of the contemporary simulators are hand-crafted, a scaleable way to build simulators is to use machine learning to learn how the environment behaves in response to an action, directly from data. In this work, we aim to learn to simulate a dynamic environment directly in pixel-space, by watching unannotated sequences of frames and their associated action pairs. We introduce a novel high-quality neural simulator referred to as DriveGAN that achieves controllability by disentangling different components without supervision. In addition to steering controls, it also includes controls for sampling features of a scene, such as the weather as well as the location of non-player objects. Since DriveGAN is a fully differentiable simulator, it further allows for re-simulation of a given video sequence, offering an agent to drive through a recorded scene again, possibly taking different actions. We train DriveGAN on multiple datasets, including 160 hours of real-world driving data. We showcase that our approach greatly surpasses the performance of previous data-driven simulators, and allows for new features not explored before.
In this paper, we leverage advances in neural networks towards forming a neural rendering for controllable image generation, and thereby bypassing the need for detailed modeling in conventional graphics pipeline. To this end, we present Neural Graphi
Low visual quality has prevented underwater robotic vision from a wide range of applications. Although several algorithms have been developed, real-time and adaptive methods are deficient for real-world tasks. In this paper, we address this difficult
Recently, hashing is widely used in approximate nearest neighbor search for its storage and computational efficiency. Most of the unsupervised hashing methods learn to map images into semantic similarity-preserving hash codes by constructing local se
In object recognition applications, object images usually appear with different quality levels. Practically, it is very important to indicate object image qualities for better application performance, e.g. filtering out low-quality object image frame
The two-stage methods for instance segmentation, e.g. Mask R-CNN, have achieved excellent performance recently. However, the segmented masks are still very coarse due to the downsampling operations in both the feature pyramid and the instance-wise po