LiveSketch: Query Perturbations for Guided Sketch-based Visual Search

95 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل John Collomosse

تاريخ النشر 2019

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف John Collomosse - Tu Bui - Hailin Jin

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

LiveSketch is a novel algorithm for searching large image collections using hand-sketched queries. LiveSketch tackles the inherent ambiguity of sketch search by creating visual suggestions that augment the query as it is drawn, making query specification an iterative rather than one-shot process that helps disambiguate users search intent. Our technical contributions are: a triplet convnet architecture that incorporates an RNN based variational autoencoder to search for images using vector (stroke-based) queries; real-time clustering to identify likely search intents (and so, targets within the search embedding); and the use of backpropagation from those targets to perturb the input stroke sequence, so suggesting alterations to the query in order to guide the search. We show improvements in accuracy and time-to-task over contemporary baselines using a 67M image corpus.

قيم البحث

148 - Kaibo Cao 2021

As a popular Q&A site for programming, Stack Overflow is a treasure for developers. However, the amount of questions and answers on Stack Overflow make it difficult for developers to efficiently locate the information they are looking for. There are two gaps leading to poor search results: the gap between the users intention and the textual query, and the semantic gap between the query and the post content. Therefore, developers have to constantly reformulate their queries by correcting misspelled words, adding limitations to certain programming languages or platforms, etc. As query reformulation is tedious for developers, especially for novices, we propose an automated software-specific query reformulation approach based on deep learning. With query logs provided by Stack Overflow, we construct a large-scale query reformulation corpus, including the original queries and corresponding reformulated ones. Our approach trains a Transformer model that can automatically generate candidate reformulated queries when given the users original query. The evaluation results show that our approach outperforms five state-of-the-art baselines, and achieves a 5.6% to 33.5% boost in terms of $mathit{ExactMatch}$ and a 4.8% to 14.4% boost in terms of $mathit{GLEU}$.

هندسة البرمجيات الذكاء الاصطناعي استرجاع المعلومات

Generalisation and Sharing in Triplet Convnets for Sketch based Visual Search

220 - Tu Bui , Leonardo Ribeiro , Moacir Ponti 2016

We propose and evaluate several triplet CNN architectures for measuring the similarity between sketches and photographs, within the context of the sketch based image retrieval (SBIR) task. In contrast to recent fine-grained SBIR work, we study the ab ility of our networks to generalise across diverse object categories from limited training data, and explore in detail strategies for weight sharing, pre-processing, data augmentation and dimensionality reduction. We exceed the performance of pre-existing techniques on both the Flickr15k category level SBIR benchmark by $18%$, and the TU-Berlin SBIR benchmark by $sim10 mathcal{T}_b$, when trained on the 250 category TU-Berlin classification dataset augmented with 25k corresponding photographs harvested from the Internet.

الرؤية الحاسوبية وتمييز الأنماط

Deep Search Query Intent Understanding

70 - Xiaowei Liu , Weiwei Guo , Huiji Gao 2020

Understanding a users query intent behind a search is critical for modern search engine success. Accurate query intent prediction allows the search engine to better serve the users need by rendering results from more relevant categories. This paper a ims to provide a comprehensive learning framework for modeling query intent under different stages of a search. We focus on the design for 1) predicting users intents as they type in queries on-the-fly in typeahead search using character-level models; and 2) accurate word-level intent prediction models for complete queries. Various deep learning components for query text understanding are experimented. Offline evaluation and online A/B test experiments show that the proposed methods are effective in understanding query intent and efficient to scale for online search systems.

الحساب واللغة الذكاء الاصطناعي استرجاع المعلومات

Sketch-Guided Scenery Image Outpainting

101 - Yaxiong Wang , Yunchao Wei , Xueming Qian 2020

The outpainting results produced by existing approaches are often too random to meet users requirement. In this work, we take the image outpainting one step forward by allowing users to harvest personal custom outpainting results using sketches as th e guidance. To this end, we propose an encoder-decoder based network to conduct sketch-guided outpainting, where two alignment modules are adopted to impose the generated content to be realistic and consistent with the provided sketches. First, we apply a holistic alignment module to make the synthesized part be similar to the real one from the global view. Second, we reversely produce the sketches from the synthesized part and encourage them be consistent with the ground-truth ones using a sketch alignment module. In this way, the learned generator will be imposed to pay more attention to fine details and be sensitive to the guiding sketches. To our knowledge, this work is the first attempt to explore the challenging yet meaningful conditional scenery image outpainting. We conduct extensive experiments on two collected benchmarks to qualitatively and quantitatively validate the effectiveness of our approach compared with the other state-of-the-art generative models.

الرؤية الحاسوبية وتمييز الأنماط علوم الكمبيوتر ونظرية الألعاب

Compositional Sketch Search

81 - Alexander Black , Tu Bui , Long Mai 2021

We present an algorithm for searching image collections using free-hand sketches that describe the appearance and relative positions of multiple objects. Sketch based image retrieval (SBIR) methods predominantly match queries containing a single, dom inant object invariant to its position within an image. Our work exploits drawings as a concise and intuitive representation for specifying entire scene compositions. We train a convolutional neural network (CNN) to encode masked visual features from sketched objects, pooling these into a spatial descriptor encoding the spatial relationships and appearances of objects in the composition. Training the CNN backbone as a Siamese network under triplet loss yields a metric search embedding for measuring compositional similarity which may be efficiently leveraged for visual search by applying product quantization.

الرؤية الحاسوبية وتمييز الأنماط