
ANSAC: Adaptive Non-minimal Sample and Consensus

Added by Victor Fragoso
Publication date: 2017
Language: English





While RANSAC-based methods are robust to incorrect image correspondences (outliers), their hypothesis generators are not robust to correct image correspondences (inliers) with positional error (noise). This slows down their convergence because hypotheses drawn from a minimal set of noisy inliers can deviate significantly from the optimal model. This work addresses this problem by introducing ANSAC, a RANSAC-based estimator that accounts for noise by adaptively using more than the minimal number of correspondences required to generate a hypothesis. ANSAC estimates the inlier ratio (the fraction of correct correspondences) of several ranked subsets of candidate correspondences and generates hypotheses from them. Its hypothesis-generation mechanism prioritizes the use of subsets with high inlier ratio to generate high-quality hypotheses. ANSAC uses an early termination criterion that keeps track of the inlier ratio history and terminates when it has not changed significantly for a period of time. The experiments show that ANSAC finds good homography and fundamental matrix estimates in a few iterations, consistently outperforming state-of-the-art methods.
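
Although the paper targets homographies and fundamental matrices, the adaptive non-minimal loop is easy to sketch on a toy problem. Below is a minimal illustration using 2D line fitting as a stand-in; the ranking scores, subset-growth schedule, and helper names are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def fit_line_lstsq(pts):
    """Least-squares line fit y = a*x + b from a (possibly non-minimal) sample."""
    A = np.column_stack([pts[:, 0], np.ones(len(pts))])
    coef, *_ = np.linalg.lstsq(A, pts[:, 1], rcond=None)
    return coef  # (a, b)

def residuals(model, pts):
    a, b = model
    return np.abs(pts[:, 1] - (a * pts[:, 0] + b))

def ansac_style_fit(pts, scores, tau=0.05, max_iters=1000,
                    min_sample=2, window=20, eps=1e-3, seed=None):
    """ANSAC-flavoured loop: rank candidates by a prior quality score, draw
    non-minimal samples from high-ranked subsets first, and stop early once
    the best inlier ratio has plateaued for `window` iterations."""
    rng = np.random.default_rng(seed)
    order = np.argsort(-scores)              # best-ranked candidates first
    best_model, best_ratio, history = None, 0.0, []
    for it in range(max_iters):
        # Sample size grows past the minimum as iterations accumulate,
        # trading minimality for robustness to inlier noise.
        k = min(len(pts), min_sample + it // 10)
        subset = pts[order[:min(len(pts), 2 * k)]]
        sample = subset[rng.choice(len(subset), size=k, replace=False)]
        model = fit_line_lstsq(sample)
        ratio = np.mean(residuals(model, pts) < tau)
        if ratio > best_ratio:
            best_model, best_ratio = model, ratio
        history.append(best_ratio)
        # Early termination: best inlier ratio unchanged over the last window.
        if len(history) > window and history[-1] - history[-window] < eps:
            break
    return best_model, best_ratio
```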

Related research

Random sample consensus (RANSAC) is a robust model-fitting algorithm widely used in many fields, including image stitching and point cloud registration. In RANSAC, data is sampled uniformly for hypothesis generation, but this uniform strategy does not fully exploit the information available in many problems. In this paper, we propose a method that samples data with a Lévy distribution together with a data-sorting algorithm. In the hypothesis-sampling step, the data is first sorted by the likelihood that each point is an inlier; hypotheses are then sampled from the sorted data with a Lévy distribution. The proposed method is evaluated on both simulated and real-world public datasets, and it shows better results than the uniform-sampling baseline.
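
A rough sketch of that sampling step follows, using the fact that if Z ~ N(0,1) then 1/Z² is Lévy-distributed; the stand-in quality scores, the scale parameter, and the rejection of out-of-range draws are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np

def levy_sample_indices(n_data, n_sample, scale=0.1, rng=None):
    """Draw sample indices biased toward the front of a sorted array
    using heavy-tailed Levy-distributed offsets."""
    rng = np.random.default_rng(rng)
    idx = set()
    while len(idx) < n_sample:
        # If Z ~ N(0,1), then 1/Z**2 follows a Levy distribution: most
        # draws land near 0 (the most likely inliers), while occasional
        # large jumps keep the sample diverse. Out-of-range draws are rejected.
        offset = int(scale * n_data / rng.standard_normal() ** 2)
        if offset < n_data:
            idx.add(offset)
    return np.fromiter(idx, dtype=int)

# Usage: sort correspondences by an inlier-likelihood score, then sample.
scores = np.random.rand(500)                 # stand-in quality scores
order = np.argsort(-scores)                  # most likely inliers first
sample = order[levy_sample_indices(500, 4)]  # e.g. 4 points for a homography
```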
Fei Wen, Hewen Wei, Yipeng Liu (2020)
Maximum consensus (MC) robust fitting is a fundamental problem in low-level vision for processing raw data. Typically, it first finds a consensus set of inliers and then fits a model on that set. This work proposes a new formulation that achieves simultaneous maximum consensus and model estimation (MCME), which has two significant features compared with traditional MC robust fitting. First, it takes the fitting residual into account when finding inliers, so its lowest achievable residual in model fitting is lower than that of MC robust fitting. Second, it has an unconstrained formulation involving binary variables, which facilitates the use of the effective semidefinite relaxation (SDR) method to handle the underlying challenging combinatorial optimization problem. Though still nonconvex after SDR, the problem becomes biconvex in some applications, and we solve it with an alternating minimization algorithm. Further, the sparsity of the problem is exploited in combination with low-rank factorization to develop an efficient algorithm. Experiments show that MCME significantly outperforms RANSAC and deterministic approximate MC methods at high outlier ratios. In rotation and Euclidean registration, it also compares favorably with state-of-the-art registration methods, especially under high noise and outlier levels. Code is available at https://github.com/FWen/mcme.git.
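
The SDR machinery is beyond a short example, but the biconvex alternating structure can be sketched on a toy regression problem; the penalty `lam`, the threshold `tau`, and the update rules below are illustrative assumptions, not the paper's method.

```python
import numpy as np

def mcme_style_fit(X, y, tau=0.1, lam=1.0, n_iters=50):
    """Alternate between the model parameters and a binary inlier indicator,
    so consensus size and fitting residual are optimized jointly."""
    w = np.ones(len(y))                    # binary inlier indicators
    theta = np.zeros(X.shape[1])
    for _ in range(n_iters):
        # Model step: weighted least squares on the current consensus set.
        sw = np.sqrt(w)
        theta, *_ = np.linalg.lstsq(sw[:, None] * X, sw * y, rcond=None)
        # Consensus step: keep a point only if its residual cost beats the
        # fixed penalty for excluding it (closed form for binary w).
        r2 = (y - X @ theta) ** 2
        w = (r2 < lam * tau ** 2).astype(float)
    return theta, w.astype(bool)
```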
Wei Sun, Tianfu Wu (2021)
Both generative learning and discriminative learning have recently witnessed remarkable progress using Deep Neural Networks (DNNs). For structured input synthesis and structured output prediction problems (e.g., layout-to-image synthesis and image semantic segmentation, respectively), they are often studied separately. This paper proposes deep consensus learning (DCL) for joint layout-to-image synthesis and weakly-supervised image semantic segmentation. The former is realized by the recently proposed LostGAN approach, and the latter by introducing an inference network as the third player joining the two-player game of LostGAN. Two deep consensus mappings are exploited to facilitate training the three networks end-to-end. Given an input layout (a list of object bounding boxes), the generator generates a mask (label map) and then uses it to help synthesize an image; the inference network infers the mask for the synthesized image; the latent consensus is measured between the mask generated by the generator and the one inferred by the inference network. For the real image corresponding to the input layout, its mask is also computed by the inference network and then used by the generator to reconstruct the real image; the data consensus is measured between the real image and its reconstruction. The discriminator still plays the role of an adversary by computing realness scores for a real image, its reconstructed image, and a synthesized image. In experiments, DCL is tested on the COCO-Stuff dataset, obtaining compelling layout-to-image synthesis results and weakly-supervised image semantic segmentation results.
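
The two consensus terms can be sketched as training losses. The interfaces assumed here (a generator returning an image and a mask, and accepting an external mask) are illustrative, not the actual LostGAN/DCL API.

```python
import torch
import torch.nn.functional as F

def consensus_losses(generator, inference_net, layout, real_image):
    """Sketch of the latent- and data-consensus terms described above."""
    # Latent consensus: the generator's own mask should agree with the mask
    # the inference network recovers from the synthesized image.
    fake_image, gen_mask = generator(layout)
    inf_mask_fake = inference_net(fake_image)
    latent_consensus = F.kl_div(inf_mask_fake.log_softmax(dim=1),
                                gen_mask.softmax(dim=1), reduction="batchmean")
    # Data consensus: the real image, re-synthesized from its inferred mask,
    # should match the original.
    inf_mask_real = inference_net(real_image)
    recon_image, _ = generator(layout, mask=inf_mask_real)
    data_consensus = F.l1_loss(recon_image, real_image)
    return latent_consensus, data_consensus
```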
Self-supervised detection and segmentation of foreground objects aims for accuracy without annotated training data. However, existing approaches predominantly rely on restrictive assumptions on appearance and motion. For scenes with dynamic activities and camera motion, we propose a multi-camera framework in which geometric constraints are embedded in the form of multi-view consistency during training via coarse 3D localization in a voxel grid and fine-grained offset regression. In this manner, we learn a joint distribution of proposals over multiple views. At inference time, our method operates on single RGB images. We outperform state-of-the-art techniques both on images that visually depart from those of standard benchmarks and on those of the classical Human3.6M dataset.
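
One way to picture the multi-view consistency constraint is below; all tensor shapes, the soft pooling over voxels, and the per-view 2D proposals are illustrative assumptions rather than the paper's architecture.

```python
import torch

def multiview_consistency_loss(voxel_logits, offsets, centers,
                               projections, proposals_2d):
    """A coarse voxel hypothesis plus a fine offset gives a 3D point;
    projecting it into every camera should agree with that view's 2D proposal.
    voxel_logits: (G,), offsets/centers: (G, 3),
    projections: (V, 3, 4) camera matrices, proposals_2d: (V, 2)."""
    probs = voxel_logits.softmax(0)
    # Soft 3D estimate: probability-weighted voxel centers plus offsets.
    point3d = (probs[:, None] * (centers + offsets)).sum(0)
    point_h = torch.cat([point3d, point3d.new_ones(1)])  # homogeneous coords
    loss = point3d.new_zeros(())
    for P, uv in zip(projections, proposals_2d):
        proj = P @ point_h
        # The projected point should agree with each view's 2D proposal.
        loss = loss + torch.nn.functional.smooth_l1_loss(proj[:2] / proj[2], uv)
    return loss / len(projections)
```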
We present a meta-learning approach for adaptive text-to-speech (TTS) with few data. During training, we learn a multi-speaker model using a shared conditional WaveNet core and independent learned embeddings for each speaker. The aim of training is not to produce a neural network with fixed weights, which is then deployed as a TTS system. Instead, the aim is to produce a network that requires few data at deployment time to rapidly adapt to new speakers. We introduce and benchmark three strategies: (i) learning the speaker embedding while keeping the WaveNet core fixed, (ii) fine-tuning the entire architecture with stochastic gradient descent, and (iii) predicting the speaker embedding with a trained neural network encoder. The experiments show that these approaches are successful at adapting the multi-speaker neural network to new speakers, obtaining state-of-the-art results in both sample naturalness and voice similarity with merely a few minutes of audio data from new speakers.
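
The three strategies can be summarized as a choice of which parameters to adapt; the `speaker_embedding` submodule name below is an assumption for illustration, not the paper's actual model interface.

```python
def adaptation_parameters(model, strategy):
    """Choose which parameters to update when adapting to a new speaker."""
    if strategy == "embedding_only":
        # (i) learn only the new speaker's embedding; WaveNet core stays fixed
        return model.speaker_embedding.parameters()
    if strategy == "full_finetune":
        # (ii) fine-tune the entire architecture with SGD
        return model.parameters()
    raise ValueError(f"unknown strategy: {strategy}")

# Strategy (iii) needs no gradient steps at deployment: a trained encoder
# predicts the embedding from a few seconds of the new speaker's audio,
#   embedding = encoder(adaptation_audio)
```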