ﻻ يوجد ملخص باللغة العربية
An interactive image retrieval system learns which images in the database belong to a users query concept, by analyzing the example images and feedback provided by the user. The challenge is to retrieve the relevant images with minimal user interaction. In this work, we propose to solve this problem by posing it as a binary classification task of classifying all images in the database as being relevant or irrelevant to the users query concept. Our method combines active learning with graph-based semi-supervised learning (GSSL) to tackle this problem. Active learning reduces the number of user interactions by querying the labels of the most informative points and GSSL allows to use abundant unlabeled data along with the limited labeled data provided by the user. To efficiently find the most informative point, we use an uncertainty sampling based method that queries the label of the point nearest to the decision boundary of the classifier. We estimate this decision boundary using our heuristic of adaptive threshold. To utilize huge volumes of unlabeled data we use an efficient approximation based method that reduces the complexity of GSSL from $O(n^3)$ to $O(n)$, making GSSL scalable. We make the classifier robust to the diversity and noisy labels associated with images in large databases by incorporating information from multiple modalities such as visual information extracted from deep learning based models and semantic information extracted from the WordNet. High F1 scores within few relevance feedback rounds in our experiments with concepts defined on AnimalWithAttributes and Imagenet (1.2 million images) datasets indicate the effectiveness and scalability of our approach.
To assess generalization, machine learning scientists typically either (i) bound the generalization gap and then (after training) plug in the empirical risk to obtain a bound on the true risk; or (ii) validate empirically on holdout data. However, (i
In semi-supervised classification, one is given access both to labeled and unlabeled data. As unlabeled data is typically cheaper to acquire than labeled data, this setup becomes advantageous as soon as one can exploit the unlabeled data in order to
Knowledge distillation is a widely used technique for model compression. We posit that the teacher model used in a distillation setup, captures relationships between classes, that extend beyond the original dataset. We empirically show that a teacher
Data augmentation by incorporating cheap unlabeled data from multiple domains is a powerful way to improve prediction especially when there is limited labeled data. In this work, we investigate how adversarial robustness can be enhanced by leveraging
Image retrieval relies heavily on the quality of the data modeling and the distance measurement in the feature space. Building on the concept of image manifold, we first propose to represent the feature space of images, learned via neural networks, a