Recent work has achieved dense 3D reconstruction with wide-aperture imaging sonar using a stereo pair of orthogonally oriented sonars. This allows each sonar to observe a spatial dimension that the other is missing, without requiring any prior assumptions about scene geometry. However, this is achieved only in a small region with overlapping fields of view, leaving large regions of sonar image observations with an unknown elevation angle. Our work aims to achieve large-scale 3D reconstruction more efficiently using this sensor arrangement. We propose dividing the world into semantic classes to exploit the presence of repeating structures in the subsea environment. We use a Bayesian inference framework to build an understanding of each object class's geometry when 3D information is available from the orthogonal sonar fusion system, and when the elevation angle of our returns is unknown, our framework is used to infer the unknown 3D structure. We quantitatively validate our method in simulation and use data collected from a real outdoor littoral environment to demonstrate the efficacy of our framework in the field. Video attachment: https://www.youtube.com/watch?v=WouCrY9eK4o&t=75s
We propose a novel approach to handling the ambiguity in elevation angle associated with the observations of a forward-looking multi-beam imaging sonar, and the challenges it poses for performing an accurate 3D reconstruction. We utilize a pair of sonars with orthogonal axes of uncertainty to independently observe the same points in the environment from two different perspectives, and associate these observations. Using these concurrent observations, we can create a dense, fully defined point cloud at every time-step to aid in reconstructing the 3D geometry of underwater scenes. We will evaluate our method in the context of the current state of the art, for which strong assumptions on object geometry limit applicability to generalized 3D scenes. We will discuss results from laboratory tests that quantitatively benchmark our algorithm's reconstruction capabilities, and results from a real-world tidal river basin that qualitatively demonstrate our ability to reconstruct a cluttered field of underwater objects.
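The geometric idea behind fusing two orthogonal sonars can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on simplifying assumptions that do not come from the abstract: the two sonars are treated as co-located with a shared forward x-axis, a return has already been associated across both images and shares a single range, and each sonar reports its bearing as the azimuth within its own image plane. The function name `fuse_orthogonal_returns` and its parameters are hypothetical.

```python
import math


def fuse_orthogonal_returns(r, bearing_h, bearing_v):
    """Recover a 3D point from one associated pair of sonar returns.

    Assumptions (illustrative only, not taken from the source):
      - both sonars share one origin and one forward (x) axis;
      - r is the common range of the associated return, in meters;
      - bearing_h = atan2(y, x), measured by the horizontal sonar;
      - bearing_v = atan2(z, x), measured by the vertical sonar;
      - both bearings lie strictly within (-pi/2, pi/2).
    """
    t_h = math.tan(bearing_h)  # y / x
    t_v = math.tan(bearing_v)  # z / x
    # From r^2 = x^2 + y^2 + z^2 = x^2 * (1 + t_h^2 + t_v^2)
    x = r / math.sqrt(1.0 + t_h ** 2 + t_v ** 2)
    return (x, x * t_h, x * t_v)


# Usage: a point at (3, 1, 2) m observed by both sonars.
true_point = (3.0, 1.0, 2.0)
r = math.sqrt(sum(c * c for c in true_point))
bearing_h = math.atan2(true_point[1], true_point[0])
bearing_v = math.atan2(true_point[2], true_point[0])
point = fuse_orthogonal_returns(r, bearing_h, bearing_v)
```

Under these assumptions each sonar supplies the angle the other cannot observe, so the point is fully determined without a prior on scene geometry. A real system would also need to handle the offset between the sensors and the association step itself.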
Autonomous underwater gliders use buoyancy control to achieve forward propulsion via a sawtooth-like, rise-and-fall trajectory. Because gliders are slow-moving relative to ocean currents, glider control must consider the effect of oceanic flows. In previous work, we proposed a method to control underwater vehicles in the (horizontal) plane by describing such oceanic flows in terms of streamlines, which are the level sets of stream functions. However, the general analytical form of streamlines in 3D is unknown. In this paper, we show how streamline control can be used in 3D environments by assuming a 2.5D model of ocean currents. We provide an efficient algorithm that acts as a steering function for a single rise or dive component of the glider's sawtooth trajectory, integrate this algorithm within a sampling-based motion planning framework to support long-distance path planning, and provide several examples in simulation in comparison with a baseline method. The key to our method's computational efficiency is an elegant dimensionality reduction to a 1D control region. Streamline-based control can be integrated within various sampling-based frameworks and allows for online planning for gliders in complicated oceanic flows.
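The relationship between a planar stream function and its streamlines can be sketched as follows. This is a generic illustration of the standard 2D definition (u = ∂ψ/∂y, v = -∂ψ/∂x), not the paper's steering algorithm. The helper name `velocity_from_stream`, the step size `h`, and the example stream function are hypothetical choices made for the sketch.

```python
def velocity_from_stream(psi, x, y, h=1e-5):
    """Approximate the 2D flow velocity (u, v) induced by stream function psi.

    Uses central finite differences with step h (an illustrative choice):
        u =  d(psi)/dy
        v = -d(psi)/dx
    """
    dpsi_dx = (psi(x + h, y) - psi(x - h, y)) / (2.0 * h)
    dpsi_dy = (psi(x, y + h) - psi(x, y - h)) / (2.0 * h)
    return dpsi_dy, -dpsi_dx


def rotational_psi(x, y):
    """Example stream function: a rigid rotation about the origin."""
    return 0.5 * (x * x + y * y)


# Usage: the velocity at (1, 2) is tangent to the circular streamline there.
u, v = velocity_from_stream(rotational_psi, 1.0, 2.0)
# The gradient of psi at (1, 2) is (1, 2); its dot product with (u, v)
# vanishes, so psi stays constant along the flow direction.
tangency = u * 1.0 + v * 2.0
```

Because the velocity is always perpendicular to the gradient of ψ, a passively drifting vehicle stays on a level set of ψ, which is why streamlines are a natural coordinate for describing where control effort is needed.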
This paper reports on a dynamic semantic mapping framework that incorporates 3D scene flow measurements into a closed-form Bayesian inference model. The existence of dynamic objects in the environment causes artifacts and traces in current mapping algorithms, leading to an inconsistent map posterior. We leverage state-of-the-art semantic segmentation and 3D flow estimation using deep learning to provide measurements for map inference. We develop a continuous (i.e., can be queried at arbitrary resolution) Bayesian model that propagates the scene with flow and infers a 3D semantic occupancy map with better performance than its static counterpart. Experimental results using publicly available data sets show that the proposed framework generalizes its predecessors and consistently improves over direct measurements from deep neural networks.
Imaging sonars have shown better flexibility than optical cameras in underwater localization and navigation for autonomous underwater vehicles (AUVs). However, the sparsity of underwater acoustic features and the loss of elevation angle in sonar frames give rise to degenerate cases, namely under-constrained or unobservable cases in optimization-based or EKF-based simultaneous localization and mapping (SLAM), respectively. In these cases, the ambiguous relative sensor poses and landmarks cannot be triangulated. To handle this, this paper proposes a robust imaging sonar SLAM approach based on sonar keyframes (KFs) and an elastic sliding window. The degeneracy cases are further analyzed, and the triangulation property of 2D landmarks under arbitrary motion is proved. These degeneracy cases are discriminated, and the sonar KFs are selected via saliency criteria to extract and save the informative constraints from previous sonar measurements. Incorporating inertial measurements, an elastic sliding-window back-end optimization is proposed to make the most of past salient sonar frames while restraining the optimization scale. Comparative experiments validate the effectiveness of the proposed method and its robustness to outliers from wrong data association, even without loop closure.
Probabilistic 3D maps have been applied to object segmentation with multiple camera viewpoints; however, conventional methods lack real-time efficiency and multilabel object mapping functionality. In this paper, we propose a method to generate a three-dimensional map with multilabel occupancy in real time. Extending our previous work, in which only target label occupancy is mapped, we achieve multilabel object segmentation in a single looking-around action. We evaluate our method by testing segmentation accuracy with 39 different objects, and by applying it to a manipulation task of multiple objects in the experiments. Our mapping-based method outperforms the conventional projection-based method by 40 - 96% relative (12.6 mean $IU_{3d}$), and the robot successfully recognizes (86.9%) and manipulates multiple objects (60.7%) in an environment with heavy occlusions.