Roses Are Red, Violets Are Blue... but Should VQA Expect Them To?


Published by Corentin Kervadec

Publication date: 2020

Research field: Informatics Engineering

Paper language: English

Author: Corentin Kervadec

Computer Vision and Pattern Recognition


Models for Visual Question Answering (VQA) are notorious for their tendency to rely on dataset biases, as the large and unbalanced diversity of questions and concepts involved tends to prevent models from learning to reason, leading them to perform educated guesses instead. In this paper, we claim that the standard evaluation metric, which measures overall in-domain accuracy, is misleading. Since questions and concepts are unbalanced, this metric tends to favor models that exploit subtle training set statistics. Alternatively, naively introducing artificial distribution shifts between train and test splits is also not completely satisfying. First, the shifts do not reflect real-world tendencies, resulting in unsuitable models; second, since the shifts are handcrafted, trained models are specifically designed for this particular setting and do not generalize to other configurations. We propose the GQA-OOD benchmark, designed to overcome these concerns: we measure and compare accuracy over both rare and frequent question-answer pairs, and argue that the former is better suited to the evaluation of reasoning abilities, which we experimentally validate with models trained to more or less exploit biases. In a large-scale study involving 7 VQA models and 3 bias reduction techniques, we also experimentally demonstrate that these models fail to address questions involving infrequent concepts, and we provide recommendations for future directions of research.
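To make the idea of reporting accuracy separately on rare and frequent question-answer pairs concrete, here is a minimal sketch. It is not the GQA-OOD implementation: the abstract does not state how "rare" is defined, so the rule used here (an answer is rare when its count within its question group is at most `alpha` times the group's mean answer count), the function name `split_accuracy`, and the sample field names are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def split_accuracy(samples, alpha=1.0):
    """Return accuracy on rare and frequent answers separately.

    samples: list of dicts with keys 'group', 'answer', 'prediction'
    (hypothetical field names, chosen for this sketch).
    """
    # Count how often each ground-truth answer occurs within its question group.
    counts = defaultdict(Counter)
    for s in samples:
        counts[s["group"]][s["answer"]] += 1

    hits = {"rare": 0, "frequent": 0}
    totals = {"rare": 0, "frequent": 0}
    for s in samples:
        group_counts = counts[s["group"]]
        mean_count = sum(group_counts.values()) / len(group_counts)
        # Assumed rule: rare means at most alpha times the group's mean count.
        is_rare = group_counts[s["answer"]] <= alpha * mean_count
        bucket = "rare" if is_rare else "frequent"
        totals[bucket] += 1
        hits[bucket] += int(s["prediction"] == s["answer"])

    # None marks a bucket that received no samples.
    return {b: (hits[b] / totals[b] if totals[b] else None) for b in totals}


# A model that always answers "red" looks fine overall (80% accuracy)
# but fails completely on the rare answer.
samples = [{"group": "rose_color", "answer": "red", "prediction": "red"}] * 4
samples.append({"group": "rose_color", "answer": "blue", "prediction": "red"})
print(split_accuracy(samples))
```

In this toy case the overall accuracy hides the failure on the infrequent answer, which is the kind of gap the abstract argues a single in-domain accuracy figure can mask.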


Yu Liu, Xuhui Jia, Mingxing Tan, 2019

Standard Knowledge Distillation (KD) approaches distill the knowledge of a cumbersome teacher model into the parameters of a student model with a pre-defined architecture. However, the knowledge of a neural network, which is represented by the network's output distribution conditioned on its input, depends not only on its parameters but also on its architecture. Hence, a more generalized approach for KD is to distill the teacher's knowledge into both the parameters and architecture of the student. To achieve this, we present a new Architecture-aware Knowledge Distillation (AKD) approach that finds student models (pearls for the teacher) that are best for distilling the given teacher model. In particular, we leverage Neural Architecture Search (NAS), equipped with our KD-guided reward, to search for the best student architectures for a given teacher. Experimental results show our proposed AKD consistently outperforms the conventional NAS plus KD approach, and achieves state-of-the-art results on the ImageNet classification task under various latency settings. Furthermore, the best AKD student architecture for the ImageNet classification task also transfers well to other tasks such as million-level face recognition and ensemble learning.

Computer Vision and Pattern Recognition, Machine Learning

Hard negative examples are hard, but useful

Hong Xuan, Abby Stylianou, Xiaotong Liu, 2020

Triplet loss is an extremely common approach to distance metric learning. Representations of images from the same class are optimized to be mapped closer together in an embedding space than representations of images from different classes. Much work on triplet losses focuses on selecting the most useful triplets of images to consider, with strategies that select dissimilar examples from the same class or similar examples from different classes. The consensus of previous research is that optimizing with the hardest negative examples leads to bad training behavior. That's a problem: these hardest negatives are literally the cases where the distance metric fails to capture semantic similarity. In this paper, we characterize the space of triplets and derive why hard negatives make triplet loss training fail. We offer a simple fix to the loss function and show that, with this fix, optimizing with hard negative examples becomes feasible. This leads to more generalizable features, and image retrieval results that outperform the state of the art for datasets with high intra-class variance.
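For readers unfamiliar with the terms, the following sketch shows the standard margin-based triplet loss and what selecting the "hardest" negative means. It illustrates the conventional formulation only; the fix to the loss function that the abstract mentions is not described there and is not reproduced here. The function names, the Euclidean distance, and the margin value of 0.2 are assumptions for illustration.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Standard triplet loss: zero once the negative is at least
    `margin` farther from the anchor than the positive is."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def hardest_negative(anchor, negatives):
    """The hardest negative is the other-class example closest to the anchor."""
    return min(negatives, key=lambda n: euclidean(anchor, n))


anchor, positive = (0.0, 0.0), (1.0, 0.0)
negatives = [(3.0, 0.0), (0.5, 0.0)]

easy_loss = triplet_loss(anchor, positive, negatives[0])   # negative is far away
hard = hardest_negative(anchor, negatives)                  # closest negative
hard_loss = triplet_loss(anchor, positive, hard)            # negative closer than positive
print(easy_loss, hard, hard_loss)
```

The easy triplet contributes no loss, while the hardest negative sits closer to the anchor than the positive does, which is exactly the case the abstract describes as the one where the metric fails to capture semantic similarity.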

Computer Vision and Pattern Recognition, Machine Learning

Quasars at intermediate redshift are not special; but they are often satellites

Shadab Alam, Nicholas P. Ross, Sarah Eftekharzadeh, 2020

Understanding the links between the activity of supermassive black holes (SMBH) at the centres of galaxies and their host dark matter haloes is a key question in modern astrophysics. The final data release of the SDSS-IV eBOSS provides the largest contemporary spectroscopic sample of galaxies and QSOs. Using this sample and covering the redshift interval $z=0.7-1.1$, we have measured the clustering properties of the eBOSS QSOs, Emission Line Galaxies (ELGs) and Luminous Red Galaxies (LRGs). We have also measured the fraction of QSOs as a function of the overdensity defined by the galaxy population. Using these measurements, we investigate how QSOs populate and sample the galaxy population, and how the host dark-matter haloes of QSOs sample the underlying halo distribution. We find that the probability of a galaxy hosting a QSO is independent of the host dark matter halo mass of the galaxy. We also find that about 60% of eBOSS QSOs are hosted by LRGs and about 20-40% of QSOs are hosted by satellite galaxies. We find a slight preference for QSOs to populate satellite galaxies over central galaxies. This is connected to the host halo mass distribution of different types of galaxies. Based on our analysis, QSOs should be hosted by a very broad distribution of haloes, and their occurrence should be modulated only by the efficiency of galaxy formation processes.

Astrophysics of Galaxies

On Clusters that are Separated but Large

Sariel Har-Peled, Joseph Rogge, 2021

Given a set $P$ of $n$ points in $\mathbb{R}^d$, consider the problem of computing $k$ subsets of $P$ that form clusters that are well-separated from each other, and each of them is large (cardinality-wise). We provide tight upper and lower bounds, and corresponding algorithms, on the quality of separation, and the size of the clusters that can be computed, as a function of $n,d,k,s$, and $\Phi$, where $s$ is the desired separation, and $\Phi$ is the spread of the point set $P$.

Computational Geometry

What are the Luminous Compact Blue Galaxies?

D.J. Pisano, 2007

Luminous Compact Blue Galaxies (LCBGs) are common at z~1, contributing significantly to the total star formation rate density. By z~0, they are a factor of ten rarer. While we know that LCBGs evolve rapidly, we do not know what drives their evolution nor into what types of galaxies they evolve. We present the results of a single-dish HI survey of local LCBGs undertaken to address these questions. Our results indicate that LCBGs have M(HI) and M(DYN) consistent with low-mass spirals, but typically exhaust their gas reservoirs in less than 2 Gyr. Overall, the properties of LCBGs are consistent with them evolving into high-mass dwarf elliptical or dwarf irregular galaxies or low-mass, late-type spiral galaxies.
