Don't Judge an Object by Its Context: Learning to Overcome Contextual Bias

Published by Krishna Kumar Singh

Publication date: 2020

Research field: Informatics Engineering

Paper language: English

Authors: Krishna Kumar Singh - Dhruv Mahajan - Kristen Grauman

Computer Vision and Pattern Recognition, Machine Learning

Existing models often leverage co-occurrences between objects and their context to improve recognition accuracy. However, strongly relying on context risks a model's generalizability, especially when typical co-occurrence patterns are absent. This work focuses on addressing such contextual biases to improve the robustness of the learnt feature representations. Our goal is to accurately recognize a category in the absence of its context, without compromising on performance when it co-occurs with context. Our key idea is to decorrelate feature representations of a category from its co-occurring context. We achieve this by learning a feature subspace that explicitly represents categories occurring in the absence of context alongside a joint feature subspace that represents both categories and context. Our very simple yet effective method is extensible to two multi-label tasks: object and attribute classification. On 4 challenging datasets, we demonstrate the effectiveness of our method in reducing contextual bias.
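The idea of pairing a context-free feature subspace with a joint subspace can be illustrated with a toy scoring function. This is only a sketch under stated assumptions, not the authors' implementation: the abstract does not say how the subspaces are built or trained, so the function name `split_logit`, the `split` index, and the choice to zero out the second subspace for context-free examples are all hypothetical.

```python
def split_logit(feature, weight, context_absent, split):
    """Score one category from a feature vector split into two subspaces.

    feature:        list of floats, the image feature vector
    weight:         list of floats, classifier weights for one category
    context_absent: True if the category appears without its usual context
    split:          hypothetical index; feature[:split] is treated as the
                    context-free subspace, feature[split:] as the rest
    """
    if context_absent:
        # Assumption: ignore the second subspace for context-free examples,
        # so the first subspace alone has to explain the label.
        used = feature[:split] + [0.0] * (len(feature) - split)
    else:
        # With context present, both subspaces contribute to the score.
        used = feature
    return sum(f * w for f, w in zip(used, weight))


feature = [1.0, 1.0, 1.0, 1.0]
weight = [1.0, 2.0, 3.0, 4.0]
score_without_context = split_logit(feature, weight, True, 2)
score_with_context = split_logit(feature, weight, False, 2)
```

In this toy setting, a category seen without its context is scored from the first two feature dimensions only, while a category seen with its context is scored from all four.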

Sunnie S. Y. Kim, Sharon Zhang, Nicole Meister 2021

Singh et al. (2020) point out the dangers of contextual bias in visual recognition datasets. They propose two methods, CAM-based and feature-split, that better recognize an object or attribute in the absence of its typical context while maintaining competitive within-context accuracy. To verify their performance, we attempted to reproduce all 12 tables in the original paper, including those in the appendix. We also conducted additional experiments to better understand the proposed methods, including increasing the regularization in CAM-based and removing the weighted loss in feature-split. As the original code was not made available, we implemented the entire pipeline from scratch in PyTorch 1.7.0. Our implementation is based on the paper and email exchanges with the authors. We found that both proposed methods in the original paper help mitigate contextual bias, although for some methods, we could not completely replicate the quantitative results in the paper even after completing an extensive hyperparameter search. For example, on COCO-Stuff, DeepFashion, and UnRel, our feature-split model achieved an increase in accuracy on out-of-context images over the standard baseline, whereas on AwA, we saw a drop in performance. For the proposed CAM-based method, we were able to reproduce the original paper's results to within 0.5% mAP. Our implementation can be found at https://github.com/princetonvisualai/ContextualBias.

Computer Vision and Pattern Recognition

An End-to-End Approach to Natural Language Object Retrieval via Context-Aware Deep Reinforcement Learning

Fan Wu, Zhongwen Xu, Yi Yang 2017

We propose an end-to-end approach to the natural language object retrieval task, which localizes an object within an image according to a natural language description, i.e., a referring expression. Previous works divide this problem into two independent stages: first, compute region proposals from the image without considering the language description; second, score the object proposals with regard to the referring expression and choose the top-ranked proposals. The object proposals are generated independently of the referring expression, which makes the proposal generation redundant and even irrelevant to the referred object. In this work, we train an agent with deep reinforcement learning, which learns to move and reshape a bounding box to localize the object according to the referring expression. We incorporate both spatial and temporal context information into the training procedure. By simultaneously exploiting local visual information, the spatial and temporal context, and the referring language a priori, the agent selects an appropriate action to take at each time step. A special action is defined to indicate when the agent finds the referred object and to terminate the procedure. We evaluate our model on various datasets, and our algorithm significantly outperforms the compared algorithms. Notably, the accuracy improvements of our method over the recent methods GroundeR and SCRC on the ReferItGame dataset are 7.67% and 18.25%, respectively.

Computer Vision and Pattern Recognition

Improvements to context based self-supervised learning

T. Nathan Mundhenk, Daniel Ho, Barry Y. Chen 2017

We develop a set of methods to improve on the results of self-supervised learning using context. We start with a baseline of patch-based arrangement context learning and go from there. Our methods address some overt problems such as chromatic aberration as well as other potential problems such as spatial skew and mid-level feature neglect. We prevent problems with testing generalization on common self-supervised benchmark tests by using different datasets during our development. The results of our methods combined yield top scores on all standard self-supervised benchmarks, including classification and detection on PASCAL VOC 2007, segmentation on PASCAL VOC 2012, and linear tests on the ImageNet and CSAIL Places datasets. We obtain an improvement over our baseline method of between 4.0 and 7.1 percentage points on transfer learning classification tests. We also show results on different standard network architectures to demonstrate generalization as well as portability. All data, models and programs are available at: https://gdo-datasci.llnl.gov/selfsupervised/.

Computer Vision and Pattern Recognition, Machine Learning, Neural and Evolutionary Computing

Quantum Tensor Network in Machine Learning: An Application to Tiny Object Classification

Fanjie Kong, Xiao-yang Liu, Ricardo Henao 2021

The tiny object classification problem arises in many machine learning applications, such as medical imaging and remote sensing, where the object of interest usually occupies a small region of the whole image. It is challenging to design an efficient machine learning model for tiny objects of interest. Current neural network structures are unable to deal with tiny objects efficiently because they are mainly developed for images featuring large-scale objects. However, quantum physics offers a strong theoretical foundation for analyzing the target function for image classification with respect to a specific object size ratio. In our work, we apply tensor networks to this challenging machine learning problem. First, we summarize previous work that connects the quantum spin model to image classification and bring the theory into the setting of tiny object classification. Second, we propose using the 2D multi-scale entanglement renormalization ansatz (MERA) to classify tiny objects in images. Finally, our experimental results indicate that tensor network models are effective for the tiny object classification problem and could potentially beat the state of the art. Our code will be available online at https://github.com/timqqt/MERA_Image_Classification.

Computer Vision and Pattern Recognition, Machine Learning

Active Learning to Overcome Sample Selection Bias: Application to Photometric Variable Star Classification

Joseph W. Richards, Dan L. Starr, Henrik Brink 2011

Despite the great promise of machine-learning algorithms to classify and predict astrophysical parameters for the vast numbers of astrophysical sources and transients observed in large-scale surveys, the peculiarities of the training data often manifest as strongly biased predictions on the data of interest. Typically, training sets are derived from historical surveys of brighter, more nearby objects than those from more extensive, deeper surveys (testing data). This sample selection bias can cause catastrophic errors in predictions on the testing data because a) standard assumptions for machine-learned model selection procedures break down and b) dense regions of testing space might be completely devoid of training data. We explore possible remedies to sample selection bias, including importance weighting (IW), co-training (CT), and active learning (AL). We argue that AL, in which the data whose inclusion in the training set would most improve predictions on the testing set are queried for manual follow-up, is an effective approach and is appropriate for many astronomical applications. For a variable star classification problem on a well-studied set of stars from Hipparcos and OGLE, AL is the optimal method in terms of error rate on the testing data, beating the off-the-shelf classifier by 3.4% and the other proposed methods by at least 3.0%. To aid with manual labeling of variable stars, we developed a web interface which allows for easy light curve visualization and querying of external databases. Finally, we apply active learning to classify variable stars in the ASAS survey, finding dramatic improvement in our agreement with the ACVS catalog, from 65.5% to 79.5%, and a significant increase in the classifier's average confidence for the testing set, from 14.6% to 42.9%, after a few AL iterations.

Instrumentation and Methods for Astrophysics, Statistics Applications