The Multilingual Corpus of Survey Questionnaires Query Interface

Added by Association for Computational Linguistics (article)

Publication date 2021

Field: Artificial Intelligence

Research language: English

Created by Shamra Editor

The dawn of the digital age has led to increasing demand for digital research resources that can be quickly processed and handled by computers. Due to the amount of data created by this digitization process, the design of tools that enable the analysis and management of data and metadata has become a relevant topic. In this context, the Multilingual Corpus of Survey Questionnaires (MCSQ) contributes to the creation and distribution of data for the Social Sciences and Humanities (SSH) following the FAIR (Findable, Accessible, Interoperable and Reusable) principles, and provides functionalities for end-users who are not acquainted with programming through an easy-to-use interface. By simply applying the desired filters in the graphical interface, users can build linguistic resources for the survey research and translation areas, such as translation memories, thus facilitating data access and usage.

References used

https://aclanthology.org/

A Corpus for Multilingual Analysis of Online Terms of Service

Association for Computational Linguistics, 2021 (article)

We present the first annotated corpus for multilingual analysis of potentially unfair clauses in online Terms of Service. The data set comprises a total of 100 contracts, obtained from 25 documents annotated in four different languages: English, German, Italian, and Polish. For each contract, potentially unfair clauses for the consumer are annotated, for nine different unfairness categories. We show how a simple yet efficient annotation projection technique based on sentence embeddings could be used to automatically transfer annotations across languages.

Keywords: terms of service, online terms, multilingual analysis

Multilingual Image Corpus: Annotation Protocol

Association for Computational Linguistics, 2021 (article)

In this paper, we present work in progress aimed at the development of a new image dataset with annotated objects. The Multilingual Image Corpus consists of an ontology of visual objects (based on WordNet) and a collection of thematically related images annotated with segmentation masks and object classes. We identified 277 dominant classes and 1,037 parent and attribute classes, and grouped them into 10 thematic domains such as sport, medicine, education, food, security, etc. For the selected classes a large-scale web image search is being conducted in order to compile a substantial collection of high-quality copyright-free images. The focus of the paper is the annotation protocol which we established to facilitate the annotation process: the Ontology of visual objects and the conventions for image selection and for object segmentation. The dataset is designed both for image classification and object detection and for semantic segmentation. In addition, the object annotations will be supplied with multilingual descriptions by using freely available wordnets.

Keywords: multilingual image corpus, image corpus, image corpus consists

Comment Section Personalization: Algorithmic, Interface, and Interaction Design

Association for Computational Linguistics, 2021 (article)

Comment sections allow users to share their personal experiences, discuss and form different opinions, and build communities out of organic conversations. However, many comment sections present chronological ranking to all users. In this paper, I discuss personalization approaches in comment sections based on different objectives for newsrooms and researchers to consider. I propose algorithmic and interface designs when personalizing the presentation of comments based on different objectives including relevance, diversity, and education/background information. I further explain how transparency, user control, and comment type diversity could help users most benefit from the personalized interacting experience.

Keywords: neural language exploration, comment sections, comment section personalization, interaction design

Multilingual ELMo and the Effects of Corpus Sampling

Association for Computational Linguistics, 2021 (article)

Multilingual pretrained language models are rapidly gaining popularity in NLP systems for non-English languages. Most of these models feature an important corpus sampling step in the process of accumulating training data in different languages, to ensure that the signal from better resourced languages does not drown out poorly resourced ones. In this study, we train multiple multilingual recurrent language models, based on the ELMo architecture, and analyse both the effect of varying corpus size ratios on downstream performance, as well as the performance difference between monolingual models for each language, and broader multilingual language models. As part of this effort, we also make these trained models available for public use.

Keywords: corpus sampling, important corpus sampling, corpus sampling step

Discovering Better Model Architectures for Medical Query Understanding

Association for Computational Linguistics, 2021 (article)

In developing an online question-answering system for the medical domains, natural language inference (NLI) models play a central role in question matching and intention detection. However, which models are best for our datasets? Manually selecting or tuning a model is time-consuming. Thus we experiment with automatically optimizing the model architectures on the task at hand via neural architecture search (NAS). First, we formulate a novel architecture search space based on the previous NAS literature, supporting cross-sentence attention (cross-attn) modeling. Second, we propose to modify the ENAS method to accelerate and stabilize the search results. We conduct extensive experiments on our two medical NLI tasks. Results show that our system can easily outperform the classical baseline models. We compare different NAS methods and demonstrate our approach provides the best results.

Keywords: medical query understanding, query understanding, medical query
