Research papers and master's and doctoral theses about "test" (اختبار)

Constructing a Diagnostic Test for Mathematics Learning Difficulties for Pupils in the First Four Grades of Basic Education

518 - Damascus University 2022 Master's thesis

The research aims to construct a test for diagnosing learning difficulties in mathematics. The test was administered to a pilot sample of 200 male and female pupils, a psychometric sample of 400, and a main sample of 2,671. Several instruments were used, and the research concluded by constructing a separate test for each grade.

diagnostic test, learning difficulties, mathematics

Sorting through the noise: Testing robustness of information processing in pre-trained language models

362 - Association for Computational Linguistics 2021 article

Pre-trained LMs have shown impressive performance on downstream NLP tasks, but we have yet to establish a clear understanding of their sophistication when it comes to processing, retaining, and applying information presented in their input. In this paper we tackle a component of this question by examining robustness of models' ability to deploy relevant context information in the face of distracting content. We present models with cloze tasks requiring use of critical context information, and introduce distracting content to test how robustly the models retain and use that critical information for prediction. We also systematically manipulate the nature of these distractors, to shed light on dynamics of models' use of contextual cues. We find that although models appear in simple contexts to make predictions based on understanding and applying relevant facts from prior context, the presence of distracting but irrelevant content has clear impact in confusing model predictions. In particular, models appear particularly susceptible to factors of semantic similarity and word position. The findings are consistent with the conclusion that LM predictions are driven in large part by superficial contextual cues, rather than by robust representations of context meaning.

testing robustness

An Analysis of Euclidean vs. Graph-Based Framing for Bilingual Lexicon Induction from Word Embedding Spaces

237 - Association for Computational Linguistics 2021 article

Much recent work in bilingual lexicon induction (BLI) views word embeddings as vectors in Euclidean space. As such, BLI is typically solved by finding a linear transformation that maps embeddings to a common space. Alternatively, word embeddings may be understood as nodes in a weighted graph. This framing allows us to examine a node's graph neighborhood without assuming a linear transform, and exploits new techniques from the graph matching optimization literature. These contrasting approaches have not been compared in BLI so far. In this work, we study the behavior of Euclidean versus graph-based approaches to BLI under differing data conditions and show that they complement each other when combined. We release our code at https://github.com/kellymarchisio/euc-v-graph-bli.

test-time adapter, bilingual lexicon

CrossVQA: Scalably Generating Benchmarks for Systematically Testing VQA Generalization

176 - Association for Computational Linguistics 2021 article

One challenge in evaluating visual question answering (VQA) models in the cross-dataset adaptation setting is that the distribution shifts are multi-modal, making it difficult to identify if it is the shifts in visual or language features that play a key role. In this paper, we propose a semi-automatic framework for generating disentangled shifts by introducing a controllable visual question-answer generation (VQAG) module that is capable of generating highly-relevant and diverse question-answer pairs with the desired dataset style. We use it to create CrossVQA, a collection of test splits for assessing VQA generalization based on the VQA2, VizWiz, and Open Images datasets. We provide an analysis of our generated datasets and demonstrate its utility by using them to evaluate several state-of-the-art VQA systems. One important finding is that the visual shifts in cross-dataset VQA matter more than the language shifts. More broadly, we present a scalable framework for systematically evaluating the machine with little human intervention.

scalably generating benchmarks, testing VQA generalization, systematically testing VQA

Iconary: A Pictionary-Based Game for Testing Multimodal Communication with Drawings and Text

174 - Association for Computational Linguistics 2021 article

Communicating with humans is challenging for AIs because it requires a shared understanding of the world, complex semantics (e.g., metaphors or analogies), and at times multi-modal gestures (e.g., pointing with a finger, or an arrow in a diagram). We investigate these challenges in the context of Iconary, a collaborative game of drawing and guessing based on Pictionary, that poses a novel challenge for the research community. In Iconary, a Guesser tries to identify a phrase that a Drawer is drawing by composing icons, and the Drawer iteratively revises the drawing to help the Guesser in response. This back-and-forth often uses canonical scenes, visual metaphor, or icon compositions to express challenging words, making it an ideal test for mixing language and visual/symbolic communication in AI. We propose models to play Iconary and train them on over 55,000 games between human players. Our models are skillful players and are able to employ world knowledge in language models to play with words unseen during training.

testing multimodal communication, testing multimodal, multimodal communication

Efficient Test Time Adapter Ensembling for Low-resource Language Varieties

173 - Association for Computational Linguistics 2021 article

Adapters are light-weight modules that allow parameter-efficient fine-tuning of pretrained models. Specialized language and task adapters have recently been proposed to facilitate cross-lingual transfer of multilingual pretrained models (Pfeiffer et al., 2020b). However, this approach requires training a separate language adapter for every language one wishes to support, which can be impractical for languages with limited data. An intuitive solution is to use a related language adapter for the new language variety, but we observe that this solution can lead to sub-optimal performance. In this paper, we aim to improve the robustness of language adapters to uncovered languages without training new adapters. We find that ensembling multiple existing language adapters makes the fine-tuned model significantly more robust to other language varieties not included in these adapters. Building upon this observation, we propose Entropy Minimized Ensemble of Adapters (EMEA), a method that optimizes the ensemble weights of the pretrained language adapters for each test sentence by minimizing the entropy of its predictions. Experiments on three diverse groups of language varieties show that our method leads to significant improvements on both named entity recognition and part-of-speech tagging across all languages.
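The entropy-minimization idea described in this abstract can be illustrated with a toy sketch. It rests on several assumptions that are not in the abstract: each adapter is reduced to a fixed vector of output logits for one test input (rather than a module inside a pretrained model), the ensemble weights are kept on the simplex with a softmax parametrization, and plain gradient descent with a hypothetical step size is used. All function names are illustrative, not the authors' code.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    """Shannon entropy (natural log) of a probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def ensemble_logits(weights, adapter_logits):
    """Weighted sum of the per-adapter logit vectors."""
    n_classes = len(adapter_logits[0])
    return [sum(w * logits[j] for w, logits in zip(weights, adapter_logits))
            for j in range(n_classes)]

def minimize_entropy(adapter_logits, steps=100, lr=0.5):
    """Gradient descent on prediction entropy w.r.t. the ensemble weights.

    Assumption: weights = softmax(scores), starting from uniform weights.
    """
    scores = [0.0] * len(adapter_logits)
    for _ in range(steps):
        weights = softmax(scores)
        probs = softmax(ensemble_logits(weights, adapter_logits))
        h = entropy(probs)
        # dH/dz_j = -p_j * (log p_j + H) for ensemble logits z
        grad_z = [-p * (math.log(p) + h) for p in probs]
        # Chain rule through z = sum_k w_k * z_k
        grad_w = [sum(g * z for g, z in zip(grad_z, logits))
                  for logits in adapter_logits]
        # Chain rule through w = softmax(scores)
        mean_grad = sum(w * g for w, g in zip(weights, grad_w))
        grad_s = [w * (g - mean_grad) for w, g in zip(weights, grad_w)]
        scores = [s - lr * g for s, g in zip(scores, grad_s)]
    return softmax(scores)

# Toy logits from three hypothetical language adapters for one test input.
adapter_logits = [[2.0, 0.5, 0.1], [0.2, 0.3, 0.1], [0.1, 1.0, 0.2]]
uniform = [1.0 / 3] * 3
h_before = entropy(softmax(ensemble_logits(uniform, adapter_logits)))
weights = minimize_entropy(adapter_logits)
h_after = entropy(softmax(ensemble_logits(weights, adapter_logits)))
print(weights, h_before, h_after)
```

In this toy setting the optimized weights give a lower-entropy prediction than the uniform ensemble; whether that also improves accuracy is an empirical question that the sketch does not address.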

efficient test time, time adapter ensembling, test time adapter

200 - Association for Computational Linguistics 2021 article

Multiple-choice questions (MCQs) are widely used in knowledge assessment in educational institutions, during work interviews, and in entertainment quizzes and games. Although research on the automatic or semi-automatic generation of multiple-choice test items has been conducted since the beginning of this millennium, most approaches focus on generating questions from a single sentence. In this research, a state-of-the-art method of creating questions based on multiple sentences is introduced. It was inspired by semantic similarity matches used in the translation memory component of translation management systems. The performance of two deep learning algorithms, doc2vec and SBERT, is compared for the paragraph similarity task. The experiments are performed on the ad-hoc corpus within the EU domain. For the automatic evaluation, a smaller corpus of manually selected matching paragraphs has been compiled. The results prove the good performance of Sentence Embeddings for the given task.
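The paragraph similarity task mentioned in this abstract amounts to ranking candidate paragraphs by the similarity of their embeddings to a query paragraph. The sketch below shows only that ranking step. As a loudly labeled assumption, a bag-of-words count vector stands in for the doc2vec or SBERT embeddings that the paper compares, and the example paragraphs and function names are invented for illustration.

```python
import math
from collections import Counter

def embed(text):
    """Stand-in embedding: token counts (NOT doc2vec or SBERT)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def best_match(query, candidates):
    """Return (score, paragraph) for the candidate most similar to the query."""
    q = embed(query)
    scored = [(cosine(q, embed(c)), c) for c in candidates]
    return max(scored, key=lambda pair: pair[0])

# Hypothetical paragraphs, loosely in the style of EU-domain text.
candidates = [
    "the regulation on fishing quotas is adopted by the council",
    "member states report annual budget figures",
]
score, paragraph = best_match(
    "the council adopts the regulation on fishing quotas", candidates
)
print(round(score, 3), paragraph)
```

Swapping `embed` for a learned sentence or paragraph encoder would leave the ranking logic unchanged, which is why the comparison in the abstract can focus on the embedding model alone.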

multiple-choice test items, generating multiple-choice test, multiple-choice test

GOT: Testing for Originality in Natural Language Generation

426 - Association for Computational Linguistics 2021 article

We propose an approach to automatically test for originality in generation tasks where no standard automatic measures exist. Our proposal addresses original uses of language, not necessarily original ideas. We provide an algorithm for our approach and a run-time analysis. The algorithm, which finds all of the original fragments in a ground-truth corpus and can reveal whether a generated fragment copies an original without attribution, has a run-time complexity of Θ(n log n), where n is the number of sentences in the ground truth.

style transfer papers, testing for originality, originality in natural

The Relationship Between Intelligence and Achievement in the Subject of Pictorial Composition

585 - University of Babylon 2021 Research paper

The existence of the university is bound up with the thought, science, and civilization that govern the course of a society's development and its qualitative and historical transitions from one stage to a more advanced one. For this reason the university has been, and remains, an institution distinguished from others by its nature and by the tasks it performs. It is the starting point of intellectual movement and the destination of the developments taking place across the advanced countries of the world. From the results presented and their discussion, a number of conclusions can be drawn, the most important of which are the following. There is a high correlation between students' intelligence in general and their achievement in pictorial composition. Intelligence is not the only factor in achieving a better level of performance in pictorial composition; other factors also matter, perhaps most importantly the presence of prerequisite learning, which can be detected through performance on the college admission test. The effect of gender on the relationship between intelligence and achievement disappears among students who possess prerequisite learning and receive more training in drawing, but this factor (gender) shows its effect among students who have less of this prerequisite learning and a shorter period of instruction: a relationship was found between intelligence and achievement among males lacking prerequisite learning, whereas no such relationship appeared among females.

intelligence and academic achievement, Standard Progressive Matrices test, closed-ended questionnaire, intelligence test

GPT Perdetry Test: Generating new meanings for new words

122 - Association for Computational Linguistics 2021 article

Human innovation in language, such as inventing new words, is a challenge for pretrained language models. We assess the ability of one large model, GPT-3, to process new words and decide on their meaning. We create a set of nonce words and prompt GPT-3 to generate their dictionary definitions. We find GPT-3 produces plausible definitions that align with human judgments. Moreover, GPT-3's definitions are sometimes preferred to those invented by humans, signaling its intriguing ability not just to adapt, but to add to the evolving vocabulary of the English language.

gpt perdetry test, gpt perdetry, perdetry test
