ﻻ يوجد ملخص باللغة العربية
Event-based cameras are dynamic vision sensors that can provide asynchronous measurements of changes in per-pixel brightness at a microsecond level. This makes them significantly faster than conventional frame-based cameras, and an appealing choice for high-speed navigation. While an interesting sensor modality, this asynchronous data poses a challenge for common machine learning techniques. In this paper, we present an event variational autoencoder for unsupervised representation learning from asynchronous event camera data. We show that it is feasible to learn compact representations from spatiotemporal event data to encode the context. Furthermore, we show that such pretrained representations can be beneficial for navigation, allowing for usage in reinforcement learning instead of end-to-end reward driven perception. We validate this framework of learning visuomotor policies by applying it to an obstacle avoidance scenario in simulation. We show that representations learnt from event data enable training fast control policies that can adapt to different control capacities, and demonstrate a higher degree of robustness than end-to-end learning from event images.
Machines are a long way from robustly solving open-world perception-control tasks, such as first-person view (FPV) aerial navigation. While recent advances in end-to-end Machine Learning, especially Imitation and Reinforcement Learning appear promisi
Dashboard cameras capture a tremendous amount of driving scene video each day. These videos are purposefully coupled with vehicle sensing data, such as from the speedometer and inertial sensors, providing an additional sensing modality for free. In t
Event cameras are activity-driven bio-inspired vision sensors, thereby resulting in advantages such as sparsity,high temporal resolution, low latency, and power consumption. Given the different sensing modality of event camera and high quality of con
How much does having visual priors about the world (e.g. the fact that the world is 3D) assist in learning to perform downstream motor tasks (e.g. delivering a package)? We study this question by integrating a generic perceptual skill set (e.g. a dis
We present a follow-up study on our unified visuomotor neural model for the robotic tasks of identifying, localizing, and grasping a target object in a scene with multiple objects. Our Retinanet-based model enables end-to-end training of visuomotor a