
Exploring Convolutional Networks for End-to-End Visual Servoing

Posted by Harit Pandya
Publication date: 2017
Research field: Informatics Engineering
Paper language: English





Existing image-based visual servoing approaches rely on extracting hand-crafted visual features from an image. Choosing the right set of features is important, as it directly affects the performance of any approach. Motivated by recent breakthroughs in the performance of data-driven methods on recognition and localization tasks, we aim to learn visual feature representations suitable for servoing tasks in unstructured and unknown environments. In this paper, we present an end-to-end learning-based approach for visual servoing in diverse scenes where knowledge of camera parameters and scene geometry is not available a priori. This is achieved by training a convolutional neural network over color images with synchronised camera poses. Through experiments performed in simulation and on a quadrotor, we demonstrate the efficacy and robustness of our approach for a wide range of camera poses in both indoor and outdoor environments.
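For a concrete sense of what such an end-to-end model can look like, the sketch below regresses a 6-DoF relative camera pose from the current and desired images with a small convolutional network supervised by synchronised pose labels. It is only an illustration of the general idea; the `ServoNet` name, layer sizes, and training details are assumptions, not the architecture from the paper.

```python
# Minimal sketch (not the paper's exact architecture): a CNN that regresses the
# relative camera pose between the current and desired image, which can then be
# turned into a servoing command. All layer sizes are illustrative assumptions.
import torch
import torch.nn as nn

class ServoNet(nn.Module):
    def __init__(self):
        super().__init__()
        # Convolutional encoder applied to the 6-channel stack of
        # (current RGB, desired RGB) images.
        self.encoder = nn.Sequential(
            nn.Conv2d(6, 32, kernel_size=7, stride=2, padding=3), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(64, 128, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        # Regress a 6-DoF relative pose: 3 translation + 3 rotation components.
        self.head = nn.Linear(128, 6)

    def forward(self, current_img, desired_img):
        x = torch.cat([current_img, desired_img], dim=1)  # (B, 6, H, W)
        feat = self.encoder(x).flatten(1)
        return self.head(feat)

# Training-step sketch: supervise with synchronised ground-truth relative poses.
model = ServoNet()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.MSELoss()

current = torch.rand(4, 3, 224, 224)   # dummy current-view images
desired = torch.rand(4, 3, 224, 224)   # dummy desired-view images
gt_pose = torch.rand(4, 6)             # dummy relative-pose labels

optimizer.zero_grad()
loss = loss_fn(model(current, desired), gt_pose)
loss.backward()
optimizer.step()
```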


Read also

S. Chen, H. Chen, W. Zhou (2020)
Visual Simultaneous Localization and Mapping (v-SLAM) and navigation of multirotor Unmanned Aerial Vehicles (UAVs) in unknown environments have grown in popularity for both research and education. However, due to the complex hardware setup, safety precautions, and battery constraints, extensive physical testing can be expensive and time-consuming. As an alternative, simulation tools lower the barrier to algorithm testing and validation before field trials. In this letter, we customize the ROS-Gazebo-PX4 simulator in depth and provide an end-to-end simulation solution for UAV v-SLAM and navigation studies. A set of localization, mapping, and path-planning kits is also integrated into the simulation platform. In our simulation, various aspects, including complex environments and onboard sensors, can simultaneously interact with our navigation framework to achieve specific surveillance missions. In this end-to-end simulation, we achieve click-and-fly-level autonomous UAV navigation. The source code is open to the research community.
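As a hedged illustration of how a vehicle in such a ROS-Gazebo-PX4 setup is commonly commanded, the snippet below streams a position setpoint to a PX4 SITL vehicle over the standard MAVROS topic. It is not taken from the letter's released code; the node name, rate, and target pose are arbitrary choices for the example.

```python
# Hedged illustration (not the letter's code): publishing a position setpoint to
# a PX4 SITL vehicle through MAVROS in a ROS-Gazebo-PX4 simulation.
import rospy
from geometry_msgs.msg import PoseStamped

def fly_to(x, y, z):
    rospy.init_node("example_setpoint_publisher")
    pub = rospy.Publisher("/mavros/setpoint_position/local", PoseStamped, queue_size=1)
    rate = rospy.Rate(20)  # PX4 expects setpoints to be streamed continuously

    target = PoseStamped()
    target.pose.position.x = x
    target.pose.position.y = y
    target.pose.position.z = z

    while not rospy.is_shutdown():
        target.header.stamp = rospy.Time.now()
        pub.publish(target)
        rate.sleep()

if __name__ == "__main__":
    fly_to(0.0, 0.0, 2.0)  # hover 2 m above the local origin
```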
Deep learning has been used to demonstrate end-to-end neural network learning for autonomous vehicle control from raw sensory input. While LiDAR sensors provide reliably accurate information, existing end-to-end driving solutions are mainly based on cameras, since processing 3D data requires a large memory footprint and computation cost. On the other hand, increasing the robustness of these systems is also critical; however, even estimating the model's uncertainty is very challenging due to the cost of sampling-based methods. In this paper, we present an efficient and robust LiDAR-based end-to-end navigation framework. We first introduce Fast-LiDARNet, which is based on sparse convolution kernel optimization and hardware-aware model design. We then propose Hybrid Evidential Fusion, which directly estimates the uncertainty of the prediction from only a single forward pass and then fuses the control predictions intelligently. We evaluate our system on a full-scale vehicle and demonstrate lane-stable driving as well as navigation capabilities. In the presence of out-of-distribution events (e.g., sensor failures), our system significantly improves robustness and reduces the number of takeovers in the real world.
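The sketch below shows one plausible form of single-pass uncertainty estimation and fusion in the spirit of the described Hybrid Evidential Fusion: an evidential regression head that predicts Normal-Inverse-Gamma parameters, followed by an inverse-variance weighted fusion of two control predictions. It is an assumption-laden illustration, not the paper's exact formulation.

```python
# Hedged sketch (not the paper's formulation): evidential regression head giving
# a control prediction plus its predictive variance from one forward pass, and a
# simple inverse-variance fusion of two such predictions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class EvidentialHead(nn.Module):
    """Predicts Normal-Inverse-Gamma parameters (gamma, nu, alpha, beta)."""
    def __init__(self, in_features):
        super().__init__()
        self.fc = nn.Linear(in_features, 4)

    def forward(self, feat):
        gamma, log_nu, log_alpha, log_beta = self.fc(feat).unbind(-1)
        nu = F.softplus(log_nu)
        alpha = F.softplus(log_alpha) + 1.0   # alpha > 1 keeps the variance finite
        beta = F.softplus(log_beta)
        mean = gamma
        # Predictive variance of the NIG distribution (aleatoric + epistemic).
        var = beta * (1.0 + nu) / (nu * (alpha - 1.0))
        return mean, var

def fuse(mean_a, var_a, mean_b, var_b):
    """Inverse-variance weighted fusion of two control predictions."""
    w_a, w_b = 1.0 / var_a, 1.0 / var_b
    return (w_a * mean_a + w_b * mean_b) / (w_a + w_b)

# Usage with dummy features from two branches of a network.
head = EvidentialHead(in_features=64)
m1, v1 = head(torch.rand(8, 64))
m2, v2 = head(torch.rand(8, 64))
steering = fuse(m1, v1, m2, v2)
```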
The slow acquisition speed of magnetic resonance imaging (MRI) has led to the development of two complementary methods: acquiring multiple views of the anatomy simultaneously (parallel imaging) and acquiring fewer samples than necessary for traditional signal processing methods (compressed sensing). While the combination of these methods has the potential to allow much faster scan times, reconstruction from such undersampled multi-coil data has remained an open problem. In this paper, we present a new approach to this problem that extends previously proposed variational methods by learning fully end-to-end. Our method obtains new state-of-the-art results on the fastMRI dataset for both brain and knee MRIs.
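To make the "learning fully end-to-end" idea concrete, here is a minimal sketch of one unrolled iteration of a learned variational reconstruction: a data-consistency gradient step in k-space combined with a small learned CNN regulariser. It assumes single-coil Cartesian sampling and a toy network size; it is not the fastMRI End-to-End VarNet itself.

```python
# Minimal sketch (single-coil, Cartesian mask assumed): one unrolled iteration of
# a learned variational MRI reconstruction.
import torch
import torch.nn as nn

class VarNetBlock(nn.Module):
    def __init__(self):
        super().__init__()
        self.lam = nn.Parameter(torch.tensor(1.0))     # learned step size
        self.refine = nn.Sequential(                   # tiny learned regulariser
            nn.Conv2d(2, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 2, 3, padding=1),
        )

    def forward(self, image, masked_kspace, mask):
        # Data-consistency gradient: F^H M (F x - y)
        kspace = torch.fft.fft2(image)
        dc = torch.fft.ifft2(mask * (kspace - masked_kspace))
        # The CNN operates on stacked real/imaginary channels of the image.
        ri = torch.stack([image.real, image.imag], dim=1)
        reg = self.refine(ri)
        reg = torch.complex(reg[:, 0], reg[:, 1])
        return image - self.lam * dc - reg

# Usage: one block applied to a zero-filled reconstruction of undersampled data.
mask = torch.zeros(1, 64, 64)
mask[:, :, ::4] = 1.0                                   # keep every 4th k-space column
kspace = mask * torch.fft.fft2(torch.randn(1, 64, 64, dtype=torch.complex64))
zero_filled = torch.fft.ifft2(kspace)
refined = VarNetBlock()(zero_filled, kspace, mask)
```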
Robotic vision plays a major role in applications ranging from factory automation to service robots. However, the traditional use of frame-based cameras limits continuous visual feedback due to their low sampling rate and redundant data in real-time image processing, especially for high-speed tasks. Event cameras provide human-like vision capabilities, such as observing dynamic changes asynchronously at a high temporal resolution (1 μs) with low latency and a wide dynamic range. In this paper, we present a visual servoing method using an event camera and a switching control strategy to explore, reach, and grasp in a manipulation task. We devise three surface layers of active events to directly process the stream of events from relative motion. A purely event-based approach is adopted to extract corner features, localize them robustly using heat maps, and generate virtual features for tracking and alignment. Based on the visual feedback, the motion of the robot is controlled so that the upcoming event features converge to the desired events in spatio-temporal space. The controller switches its strategy based on the sequence of operation to establish a stable grasp. The event-based visual servoing (EBVS) method is validated experimentally using a commercial robot manipulator in an eye-in-hand configuration. Experiments prove the effectiveness of the EBVS method in tracking and grasping objects of different shapes without the need for re-tuning.
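The sketch below illustrates one simple way an event stream could be accumulated into a decaying heat map whose peaks serve as corner-feature candidates, in the spirit of the heat-map localization described above. The sensor resolution, decay constant, and function names are assumptions, not the EBVS implementation.

```python
# Hedged sketch (not the paper's pipeline): accumulate events into an
# exponentially decaying heat map and pick its strongest peaks as corner features.
import numpy as np

H, W = 260, 346            # assumed event-camera resolution
TAU = 1e-2                 # assumed decay time constant in seconds

def heat_map(events, t_now):
    """events: array of (x, y, t) rows; returns a decayed activity map."""
    hm = np.zeros((H, W))
    for x, y, t in events:
        hm[int(y), int(x)] += np.exp(-(t_now - t) / TAU)
    return hm

def top_corners(hm, k=4):
    """Return the k strongest peaks as (row, col) feature locations."""
    flat = np.argsort(hm, axis=None)[::-1][:k]
    return np.column_stack(np.unravel_index(flat, hm.shape))

# Usage: a few synthetic events from the last milliseconds of the stream.
events = np.array([[100, 50, 0.998], [101, 50, 0.999], [200, 120, 0.997]])
corners = top_corners(heat_map(events, t_now=1.0), k=2)
```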
Knowledge graph embedding has been an active research topic for knowledge base completion, with progressive improvement from the initial TransE, TransH, and DistMult to the current state-of-the-art ConvE. ConvE uses 2D convolution over embeddings and multiple layers of nonlinear features to model knowledge graphs. The model can be trained efficiently and scales to large knowledge graphs. However, there is no structure enforcement in the embedding space of ConvE. The recent graph convolutional network (GCN) provides another way of learning graph node embeddings by successfully utilizing graph connectivity structure. In this work, we propose a novel end-to-end Structure-Aware Convolutional Network (SACN) that combines the benefits of GCN and ConvE. SACN consists of an encoder, a weighted graph convolutional network (WGCN), and a decoder, a convolutional network called Conv-TransE. WGCN utilizes knowledge graph node structure, node attributes, and edge relation types. It has learnable weights that adapt the amount of information from neighbors used in local aggregation, leading to more accurate embeddings of graph nodes. Node attributes in the graph are represented as additional nodes in the WGCN. The decoder Conv-TransE enables the state-of-the-art ConvE to be translational between entities and relations while keeping the same link prediction performance as ConvE. We demonstrate the effectiveness of the proposed SACN on the standard FB15k-237 and WN18RR datasets, where it gives about a 10% relative improvement over the state-of-the-art ConvE in terms of HITS@1, HITS@3, and HITS@10.
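As a rough illustration of the WGCN encoder idea, the sketch below implements a graph convolution layer in which each edge-relation type has a learnable scalar weight that scales how much its neighbours contribute to the aggregation. It is a simplified, assumption-based sketch, not the released SACN code.

```python
# Hedged sketch (not the SACN release): a weighted-GCN layer with one learnable
# scalar per relation type controlling neighbour aggregation.
import torch
import torch.nn as nn

class WGCNLayer(nn.Module):
    def __init__(self, in_dim, out_dim, num_relations):
        super().__init__()
        self.alpha = nn.Parameter(torch.ones(num_relations))  # learnable relation weights
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, node_feats, adj_per_relation):
        # adj_per_relation: (num_relations, N, N), one adjacency slice per relation type.
        agg = node_feats.clone()                   # self-connection
        for r, adj in enumerate(adj_per_relation):
            agg = agg + self.alpha[r] * (adj @ node_feats)
        return torch.tanh(self.linear(agg))

# Tiny usage example: 5 nodes, 8-dim features, 2 relation types.
x = torch.rand(5, 8)
adj = torch.zeros(2, 5, 5)
adj[0, 0, 1] = adj[0, 1, 0] = 1.0                  # one edge of relation type 0
layer = WGCNLayer(8, 16, num_relations=2)
out = layer(x, adj)                                # (5, 16) updated node embeddings
```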