We study the impact of using rich and diverse textual descriptions of classes for zero-shot learning (ZSL) on ImageNet. We create a new dataset, ImageNet-Wiki, that matches each ImageNet class to its corresponding Wikipedia article. We show that merely employing these Wikipedia articles as class descriptions yields much higher ZSL performance than prior work. Even a simple model using this type of auxiliary data outperforms state-of-the-art models that rely on standard features such as word-embedding encodings of class names. These results highlight the usefulness and importance of textual descriptions for ZSL, as well as the relative importance of the auxiliary data type compared to algorithmic progress. Our experimental results also show that standard zero-shot learning approaches generalize poorly across categories of classes.
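The setup described above, scoring an image against textual class descriptions rather than class-name embeddings, can be sketched as follows. This is a minimal illustration under assumed details, not the authors' exact model: it uses random placeholder features where a real pipeline would use CNN image features and an encoding of each class's Wikipedia article, and a bilinear compatibility map as one common simple ZSL scoring choice.

```python
# Minimal sketch (assumptions, not the paper's exact model) of zero-shot
# classification with textual class descriptions: score an image embedding
# against class-description embeddings via a bilinear compatibility map.
import numpy as np

rng = np.random.default_rng(0)

d_img, d_txt, n_classes = 512, 300, 5

# Placeholder features; a real system would use pretrained CNN features for
# the image and an encoding of each class's Wikipedia article for the text.
image_feat = rng.normal(size=d_img)
class_desc = rng.normal(size=(n_classes, d_txt))  # one row per unseen class

# Compatibility matrix W; in practice this is learned on seen classes.
W = rng.normal(size=(d_img, d_txt))

scores = image_feat @ W @ class_desc.T            # shape: (n_classes,)
pred = int(np.argmax(scores))                     # predicted unseen class
print(scores.shape, pred)
```

Because the class descriptions, not the images, carry the unseen-class information, swapping in richer descriptions (full Wikipedia articles instead of class-name embeddings) changes only `class_desc` while the rest of the pipeline stays fixed.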