No Arabic abstract
This paper proposes a method that combines the style transfer technique and the learned descriptor to enhance the matching performances of underwater sonar images. In the field of underwater vision, sonar is currently the most effective long-distance detection sensor, it has excellent performances in map building and target search tasks. However, the traditional image matching algorithms are all developed based on optical images. In order to solve this contradiction, the style transfer method is used to convert the sonar images into optical styles, and at the same time, the learned descriptor with excellent expressiveness for sonar images matching is introduced. Experiments show that this method significantly enhances the matching quality of sonar images. In addition, it also provides new ideas for the preprocessing of underwater sonar images by using the style transfer approach.
In the field of underwater vision research, image matching between the sonar sensors and optical cameras has always been a challenging problem. Due to the difference in the imaging mechanism between them, which are the gray value, texture, contrast, etc. of the acoustic images and the optical images are also variant in local locations, which makes the traditional matching method based on the optical image invalid. Coupled with the difficulties and high costs of underwater data acquisition, it further affects the research process of acousto-optic data fusion technology. In order to maximize the use of underwater sensor data and promote the development of multi-sensor information fusion (MSIF), this study applies the image attribute transfer method based on deep learning approach to solve the problem of acousto-optic image matching, the core of which is to eliminate the imaging differences between them as much as possible. At the same time, the advanced local feature descriptor is introduced to solve the challenging acousto-optic matching problem. Experimental results show that our proposed method could preprocess acousto-optic images effectively and obtain accurate matching results. Additionally, the method is based on the combination of image depth semantic layer, and it could indirectly display the local feature matching relationship between original image pair, which provides a new solution to the underwater multi-sensor image matching problem.
Binocular stereo vision is an important branch of machine vision, which imitates the human eye and matches the left and right images captured by the camera based on epipolar constraints. The matched disparity map can be calculated according to the camera imaging model to obtain a depth map, and then the depth map is converted to a point cloud image to obtain spatial point coordinates, thereby achieving the purpose of ranging. However, due to the influence of illumination under water, the captured images no longer meet the epipolar constraints, and the changes in imaging models make traditional calibration methods no longer applicable. Therefore, this paper proposes a new underwater real-time calibration method and a matching method based on the best search domain to improve the accuracy of underwater distance measurement using binoculars.
Gram-based and patch-based approaches are two important research lines of image style transfer. Recent diversified Gram-based methods have been able to produce multiple and diverse reasonable solutions for the same content and style inputs. However, as another popular research interest, the diversity of patch-based methods remains challenging due to the stereotyped style swapping process based on nearest patch matching. To resolve this dilemma, in this paper, we dive into the core style swapping process of patch-based style transfer and explore possible ways to diversify it. What stands out is an operation called shifted style normalization (SSN), the most effective and efficient way to empower existing patch-based methods to generate diverse results for arbitrary styles. The key insight is to use an important intuition that neural patches with higher activation values could contribute more to diversity. Theoretical analyses and extensive experiments are conducted to demonstrate the effectiveness of our method, and compared with other possible options and state-of-the-art algorithms, it shows remarkable superiority in both diversity and efficiency.
In this paper, we propose a photorealistic style transfer network to emphasize the natural effect of photorealistic image stylization. In general, distortion of the image content and lacking of details are two typical issues in the style transfer field. To this end, we design a novel framework employing the U-Net structure to maintain the rich spatial clues, with a multi-layer feature aggregation (MFA) method to simultaneously provide the details obtained by the shallow layers in the stylization processing. In particular, an encoder based on the dense block and a decoder form a symmetrical structure of U-Net are jointly staked to realize an effective feature extraction and image reconstruction. Besides, a transfer module based on MFA and adaptive instance normalization (AdaIN) is inserted in the skip connection positions to achieve the stylization. Accordingly, the stylized image possesses the texture of a real photo and preserves rich content details without introducing any mask or post-processing steps. The experimental results on public datasets demonstrate that our method achieves a more faithful structural similarity with a lower style loss, reflecting the effectiveness and merit of our approach.
Imaging sonars have shown better flexibility than optical cameras in underwater localization and navigation for autonomous underwater vehicles (AUVs). However, the sparsity of underwater acoustic features and the loss of elevation angle in sonar frames have imposed degeneracy cases, namely under-constrained or unobservable cases according to optimization-based or EKF-based simultaneous localization and mapping (SLAM). In these cases, the relative ambiguous sensor poses and landmarks cannot be triangulated. To handle this, this paper proposes a robust imaging sonar SLAM approach based on sonar keyframes (KFs) and an elastic sliding window. The degeneracy cases are further analyzed and the triangulation property of 2D landmarks in arbitrary motion has been proved. These degeneracy cases are discriminated and the sonar KFs are selected via saliency criteria to extract and save the informative constraints from previous sonar measurements. Incorporating the inertial measurements, an elastic sliding windowed back-end optimization is proposed to mostly utilize the past salient sonar frames and also restrain the optimization scale. Comparative experiments validate the effectiveness of the proposed method and its robustness to outliers from the wrong data association, even without loop closure.