Short-answer scoring is the task of assessing the correctness of a short text given as a response to a question, which can come from a variety of educational scenarios. Because only content, not form, is important, the exact wording of an answer, including its explicitness, should not matter. However, many state-of-the-art scoring models rely heavily on lexical information, be it word embeddings in a neural network or n-grams in an SVM. Thus, the exact wording of an answer might very well make a difference. We therefore quantify the extent to which implicit language phenomena occur in short-answer datasets and examine their influence on automatic scoring performance. We find that the level of implicitness depends on the individual question and that some phenomena are very frequent. Resolving implicit wording to explicit formulations indeed tends to improve automatic scoring performance.
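To illustrate why lexical features are sensitive to implicit wording, the following is a minimal sketch, not the scoring setup used in this work. It compares the word n-gram overlap of an explicit and an implicit answer against a reference answer. The example sentences, the helper names `word_ngrams` and `jaccard`, and the choice of Jaccard overlap are all assumptions made for illustration.

```python
from typing import List, Set, Tuple

NGram = Tuple[str, ...]


def word_ngrams(text: str, n_max: int = 2) -> Set[NGram]:
    """Return the set of word n-grams (n = 1..n_max) of a lowercased text."""
    tokens: List[str] = text.lower().split()
    grams: Set[NGram] = set()
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams.add(tuple(tokens[i:i + n]))
    return grams


def jaccard(a: Set[NGram], b: Set[NGram]) -> float:
    """Jaccard overlap between two n-gram sets (1.0 if both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical answers: both learner answers express the same content,
# but the implicit one replaces the content words with pronouns.
reference = "the plant needs sunlight to grow"
explicit = "the plant needs sunlight to grow well"
implicit = "it needs that to grow"

ref_grams = word_ngrams(reference)
overlap_explicit = jaccard(ref_grams, word_ngrams(explicit))
overlap_implicit = jaccard(ref_grams, word_ngrams(implicit))

print(f"explicit answer overlap: {overlap_explicit:.2f}")
print(f"implicit answer overlap: {overlap_implicit:.2f}")
```

In this toy case the implicit answer shares far fewer n-grams with the reference than the explicit one, even though a human grader would likely treat both as equivalent. A model built on such features therefore sees the two answers very differently.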
References used
https://aclanthology.org/