Image-to-GPS Verification Through A Bottom-Up Pattern Matching Network

186 0 0.0 ( 0 )

Download Cite

Added by Jiaxin Cheng

Publication date 2018

fields Informatics Engineering

and research's language is English

Authors Jiaxin Cheng - Yue Wu - Wael Abd-Almageed

Computer Vision and Pattern Recognition

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

The image-to-GPS verification problem asks whether a given image is taken at a claimed GPS location. In this paper, we treat it as an image verification problem -- whether a query image is taken at the same place as a reference image retrieved at the claimed GPS location. We make three major contributions: 1) we propose a novel custom bottom-up pattern matching (BUPM) deep neural network solution; 2) we demonstrate that the verification can be directly done by cross-checking a perspective-looking query image and a panorama reference image, and 3) we collect and clean a dataset of 30K pairs query and reference. Our experimental results show that the proposed BUPM solution outperforms the state-of-the-art solutions in terms of both verification and localization.

rate research

Grover search revisited; application to image pattern matching

106 - Hiroyuki Tezuka , Kouhei Nakaji , Takahiko Satoh 2021

The landmark Grover algorithm for amplitude amplification serves as an essential subroutine in various type of quantum algorithms, with guaranteed quantum speedup in query complexity. However, there have been no proposal to realize the original motivating application of the algorithm, i.e., the database search or more broadly the pattern matching in a practical setting, mainly due to the technical difficulty in efficiently implementing the data loading and amplitude amplification processes. In this paper, we propose a quantum algorithm that approximately executes the entire Grover database search or pattern matching algorithm. The key idea is to use the recently proposed approximate amplitude encoding method on a shallow quantum circuit, together with the easily implementable inversion-test operation for realizing the projected quantum state having similarity to the query data, followed by the amplitude amplification independent to the target index. We provide a thorough demonstration of the algorithm in the problem of image pattern matching.

Quantum Physics

Achieving Competitive Play Through Bottom-Up Approach in Semantic Segmentation

369 - E. Pryzant , Q. Deng , B. Mei 2021

With the renaissance of neural networks, object detection has slowly shifted from a bottom-up recognition problem to a top-down approach. Best in class algorithms enumerate a near-complete list of objects and classify each into object/not object. In this paper, we show that strong performance can still be achieved using a bottom-up approach for vision-based object recognition tasks and achieve competitive video game play. We propose PuckNet, which is used to detect four extreme points (top, left, bottom, and right-most points) and one center point of objects using a fully convolutional neural network. Object detection is then a purely keypoint-based appearance estimation problem, without implicit feature learning or region classification. The method proposed herein performs on-par with the best in class region-based detection methods, with a bounding box AP of 36.4% on COCO test-dev. In addition, the extreme points estimated directly resolve into a rectangular object mask, with a COCO Mask AP of 17.6%, outperforming the Mask AP of vanilla bounding boxes. Guided segmentation of extreme points further improves this to 32.1% Mask AP. We applied the PuckNet vision system to the SuperTuxKart video game to test its capacity to achieve competitive play in dynamic and co-operative multiplayer environments.

Computer Vision and Pattern Recognition

Graph Structured Network for Image-Text Matching

193 - Chunxiao Liu , Zhendong Mao , Tianzhu Zhang 2020

Image-text matching has received growing interest since it bridges vision and language. The key challenge lies in how to learn correspondence between image and text. Existing works learn coarse correspondence based on object co-occurrence statistics, while failing to learn fine-grained phrase correspondence. In this paper, we present a novel Graph Structured Matching Network (GSMN) to learn fine-grained correspondence. The GSMN explicitly models object, relation and attribute as a structured phrase, which not only allows to learn correspondence of object, relation and attribute separately, but also benefits to learn fine-grained correspondence of structured phrase. This is achieved by node-level matching and structure-level matching. The node-level matching associates each node with its relevant nodes from another modality, where the node can be object, relation or attribute. The associated nodes then jointly infer fine-grained correspondence by fusing neighborhood associations at structure-level matching. Comprehensive experiments show that GSMN outperforms state-of-the-art methods on benchmarks, with relative Recall@1 improvements of nearly 7% and 2% on Flickr30K and MSCOCO, respectively. Code will be released at: https://github.com/CrossmodalGroup/GSMN.

Computer Vision and Pattern Recognition

A Bottom-up Approach for Pancreas Segmentation using Cascaded Superpixels and (Deep) Image Patch Labeling

157 - Amal Farag , Le Lu , Holger R. Roth 2015

Robust automated organ segmentation is a prerequisite for computer-aided diagnosis (CAD), quantitative imaging analysis and surgical assistance. For high-variability organs such as the pancreas, previous approaches report undesirably low accuracies. We present a bottom-up approach for pancreas segmentation in abdominal CT scans that is based on a hierarchy of information propagation by classifying image patches at different resolutions; and cascading superpixels. There are four stages: 1) decomposing CT slice images as a set of disjoint boundary-preserving superpixels; 2) computing pancreas class probability maps via dense patch labeling; 3) classifying superpixels by pooling both intensity and probability features to form empirical statistics in cascaded random forest frameworks; and 4) simple connectivity based post-processing. The dense image patch labeling are conducted by: efficient random forest classifier on image histogram, location and texture features; and more expensive (but with better specificity) deep convolutional neural network classification on larger image windows (with more spatial contexts). Evaluation of the approach is performed on a database of 80 manually segmented CT volumes in six-fold cross-validation (CV). Our achieved results are comparable, or better than the state-of-the-art methods (evaluated by leave-one-patient-out), with Dice 70.7% and Jaccard 57.9%. The computational efficiency has been drastically improved in the order of 6~8 minutes, comparing with others of ~10 hours per case. Finally, we implement a multi-atlas label fusion (MALF) approach for pancreas segmentation using the same datasets. Under six-fold CV, our bottom-up segmentation method significantly outperforms its MALF counterpart: (70.7 +/- 13.0%) versus (52.5 +/- 20.8%) in Dice. Deep CNN patch labeling confidences offer more numerical stability, reflected by smaller standard deviations.

Computer Vision and Pattern Recognition

Image Retrieval and Pattern Spotting using Siamese Neural Network

61 - Kelly L. Wiggers , Alceu S. Britto Jr. , Laurent Heutte andn Alessandro L. Koerich 2019

This paper presents a novel approach for image retrieval and pattern spotting in document image collections. The manual feature engineering is avoided by learning a similarity-based representation using a Siamese Neural Network trained on a previously prepared subset of image pairs from the ImageNet dataset. The learned representation is used to provide the similarity-based feature maps used to find relevant image candidates in the data collection given an image query. A robust experimental protocol based on the public Tobacco800 document image collection shows that the proposed method compares favorably against state-of-the-art document image retrieval methods, reaching 0.94 and 0.83 of mean average precision (mAP) for retrieval and pattern spotting (IoU=0.7), respectively. Besides, we have evaluated the proposed method considering feature maps of different sizes, showing the impact of reducing the number of features in the retrieval performance and time-consuming.

Computer Vision and Pattern Recognition