Improving Deep Metric Learning by Divide and Conquer

99 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Artsiom Sanakoyeu

تاريخ النشر 2021

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Artsiom Sanakoyeu - Pingchuan Ma - Vadim Tschernezki

الرؤية الحاسوبية وتمييز الأنماط

قم بزيارة صفحتنا على فيسبوك

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

Deep metric learning (DML) is a cornerstone of many computer vision applications. It aims at learning a mapping from the input domain to an embedding space, where semantically similar objects are located nearby and dissimilar objects far from another. The target similarity on the training data is defined by user in form of ground-truth class labels. However, while the embedding space learns to mimic the user-provided similarity on the training data, it should also generalize to novel categories not seen during training. Besides user-provided groundtruth training labels, a lot of additional visual factors (such as viewpoint changes or shape peculiarities) exist and imply different notions of similarity between objects, affecting the generalization on the images unseen during training. However, existing approaches usually directly learn a single embedding space on all available training data, struggling to encode all different types of relationships, and do not generalize well. We propose to build a more expressive representation by jointly splitting the embedding space and the data hierarchically into smaller sub-parts. We successively focus on smaller subsets of the training data, reducing its variance and learning a different embedding subspace for each data subset. Moreover, the subspaces are learned jointly to cover not only the intricacies, but the breadth of the data as well. Only after that, we build the final embedding from the subspaces in the conquering stage. The proposed algorithm acts as a transparent wrapper that can be placed around arbitrary existing DML methods. Our approach significantly improves upon the state-of-the-art on image retrieval, clustering, and re-identification tasks evaluated using CUB200-2011, CARS196, Stanford Online Products, In-shop Clothes, and PKU VehicleID datasets.

قيم البحث

93 - Artsiom Sanakoyeu , Vadim Tschernezki , Uta Buchler 2019

Learning the embedding space, where semantically similar objects are located close together and dissimilar objects far apart, is a cornerstone of many computer vision applications. Existing approaches usually learn a single metric in the embedding sp ace for all available data points, which may have a very complex non-uniform distribution with different notions of similarity between objects, e.g. appearance, shape, color or semantic meaning. Approaches for learning a single distance metric often struggle to encode all different types of relationships and do not generalize well. In this work, we propose a novel easy-to-implement divide and conquer approach for deep metric learning, which significantly improves the state-of-the-art performance of metric learning. Our approach utilizes the embedding space more efficiently by jointly splitting the embedding space and data into $K$ smaller sub-problems. It divides both, the data and the embedding space into $K$ subsets and learns $K$ separate distance metrics in the non-overlapping subspaces of the embedding space, defined by groups of neurons in the embedding layer of the neural network. The proposed approach increases the convergence speed and improves generalization since the complexity of each sub-problem is reduced compared to the original one. We show that our approach outperforms the state-of-the-art by a large margin in retrieval, clustering and re-identification tasks on CUB200-2011, CARS196, Stanford Online Products, In-shop Clothes and PKU VehicleID datasets.

الرؤية الحاسوبية وتمييز الأنماط التعلم الآلي

PairRank: Online Pairwise Learning to Rank by Divide-and-Conquer

91 - Yiling Jia , Huazheng Wang , Stephen Guo 2021

Online Learning to Rank (OL2R) eliminates the need of explicit relevance annotation by directly optimizing the rankers from their interactions with users. However, the required exploration drives it away from successful practices in offline learning to rank, which limits OL2Rs empirical performance and practical applicability. In this work, we propose to estimate a pairwise learning to rank model online. In each round, candidate documents are partitioned and ranked according to the models confidence on the estimated pairwise rank order, and exploration is only performed on the uncertain pairs of documents, i.e., emph{divide-and-conquer}. Regret directly defined on the number of mis-ordered pairs is proven, which connects the online solutions theoretical convergence with its expected ranking performance. Comparisons against an extensive list of OL2R baselines on two public learning to rank benchmark datasets demonstrate the effectiveness of the proposed solution.

التعلم الآلي استرجاع المعلومات

Bound and Conquer: Improving Triangulation by Enforcing Consistency

66 - Adam Scholefield , Alireza Ghasemi , Martin Vetterli 2018

We study the accuracy of triangulation in multi-camera systems with respect to the number of cameras. We show that, under certain conditions, the optimal achievable reconstruction error decays quadratically as more cameras are added to the system. Fu rthermore, we analyse the error decay-rate of major state-of-the-art algorithms with respect to the number of cameras. To this end, we introduce the notion of consistency for triangulation, and show that consistent reconstruction algorithms achieve the optimal quadratic decay, which is asymptotically faster than some other methods. Finally, we present simulations results supporting our findings. Our simulations have been implemented in MATLAB and the resulting code is available in the supplementary material.

الرؤية الحاسوبية وتمييز الأنماط

Improving Image co-segmentation via Deep Metric Learning

259 - Zhengwen Li , Xiabi Liu 2021

Deep Metric Learning (DML) is helpful in computer vision tasks. In this paper, we firstly introduce DML into image co-segmentation. We propose a novel Triplet loss for Image Segmentation, called IS-Triplet loss for short, and combine it with traditio nal image segmentation loss. Different from the general DML task which learns the metric between pictures, we treat each pixel as a sample, and use their embedded features in high-dimensional space to form triples, then we tend to force the distance between pixels of different categories greater than of the same category by optimizing IS-Triplet loss so that the pixels from different categories are easier to be distinguished in the high-dimensional feature space. We further present an efficient triple sampling strategy to make a feasible computation of IS-Triplet loss. Finally, the IS-Triplet loss is combined with 3 traditional image segmentation losses to perform image segmentation. We apply the proposed approach to image co-segmentation and test it on the SBCoseg dataset and the Internet dataset. The experimental result shows that our approach can effectively improve the discrimination of pixels categories in high-dimensional space and thus help traditional loss achieve better performance of image segmentation with fewer training epochs.

الرؤية الحاسوبية وتمييز الأنماط الذكاء الاصطناعي

Divide and Conquer Networks

206 - Alex Nowak-Vila , David Folque , Joan Bruna 2016

We consider the learning of algorithmic tasks by mere observation of input-output pairs. Rather than studying this as a black-box discrete regression problem with no assumption whatsoever on the input-output mapping, we concentrate on tasks that are amenable to the principle of divide and conquer, and study what are its implications in terms of learning. This principle creates a powerful inductive bias that we leverage with neural architectures that are defined recursively and dynamically, by learning two scale-invariant atomic operations: how to split a given input into smaller sets, and how to merge two partially solved tasks into a larger partial solution. Our model can be trained in weakly supervised environments, namely by just observing input-output pairs, and in even weaker environments, using a non-differentiable reward signal. Moreover, thanks to the dynamic aspect of our architecture, we can incorporate the computational complexity as a regularization term that can be optimized by backpropagation. We demonstrate the flexibility and efficiency of the Divide-and-Conquer Network on several combinatorial and geometric tasks: convex hull, clustering, knapsack and euclidean TSP. Thanks to the dynamic programming nature of our model, we show significant improvements in terms of generalization error and computational complexity.

التعلم الآلي التعلم الالي

سجل دخول لتتمكن من نشر تعليقات

التعليقات

جاري جلب التعليقات

سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها

جامعة الشھباء الخاصة

تفاصيل إضافية المزيد من الجامعات

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

Improving Deep Metric Learning by Divide and Conquer

اسأل ChatGPT حول البحث

ﻻ يوجد ملخص باللغة العربية

اقرأ أيضاً