In recent years, the field of computer-assisted language learning has increasingly focused on computer-aided question generation. However, this approach often presents test takers with an exhaustive number of questions that are not designed for any specific testing purpose. In this work, we present a personalized computer-aided question generation system that generates multiple-choice questions at various difficulty levels and of various types, including vocabulary, grammar, and reading comprehension. To address test takers' weaknesses, it selects questions according to an estimated proficiency level and the unclear concepts behind incorrect responses. The results show that students using the personalized automatic quiz generation corrected their mistakes more frequently than those using computer-aided question generation alone. Moreover, these students demonstrated the greatest progress between the pretest and the post-test and correctly answered more difficult questions. Finally, we investigated the personalizing strategy and found that a student could make significant progress if the proposed system offered vocabulary questions at the same level as his or her proficiency, and grammar and reading comprehension questions at a level below his or her proficiency.
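The following is a minimal illustrative sketch of the level-offset selection strategy described above. All names (Question, target_level, select_questions) and the numeric level scale are assumptions for illustration, not the system reported in the paper.

```python
# Illustrative sketch only: the data model and function names are assumptions,
# not the paper's implementation.
from dataclasses import dataclass
from typing import List, Set

@dataclass
class Question:
    qtype: str     # "vocabulary", "grammar", or "reading"
    level: int     # difficulty level, e.g. 1 (easiest) .. 6 (hardest)
    concept: str   # concept tested, used to target earlier mistakes

def target_level(qtype: str, proficiency: int) -> int:
    """Pick a difficulty level relative to the learner's estimated proficiency.

    Vocabulary questions are drawn at the learner's level; grammar and reading
    comprehension questions one level below, following the personalizing
    strategy reported above.
    """
    if qtype == "vocabulary":
        return proficiency
    return max(1, proficiency - 1)

def select_questions(bank: List[Question], proficiency: int,
                     weak_concepts: Set[str], per_type: int = 5) -> List[Question]:
    """Prefer questions at the target level that touch previously missed concepts."""
    selected = []
    for qtype in ("vocabulary", "grammar", "reading"):
        level = target_level(qtype, proficiency)
        candidates = [q for q in bank if q.qtype == qtype and q.level == level]
        # Questions on weak concepts sort first (False < True).
        candidates.sort(key=lambda q: q.concept not in weak_concepts)
        selected.extend(candidates[:per_type])
    return selected
```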
This paper proposes a novel statistical method of ability estimation based on acquisition distributions for personalized computer-aided question generation. The method captures learning outcomes over time and provides a flexible measurement based on acquisition distributions instead of precalibration. Compared with previous studies, the proposed method is robust, especially when a student's ability is unknown. Results on empirical data show that the estimated abilities match the learners' actual abilities, and that the experimental group improved significantly from pretest to post-test. These results suggest that this method can serve as the ability estimator for a personalized computer-aided testing environment.
In text generation evaluation, many practical issues, such as inconsistent experimental settings and metric implementations, are often ignored but lead to unfair evaluation and untenable conclusions. We present CoTK, an open-source toolkit aiming to support fast development and fair evaluation of text generation. In model development, CoTK helps handle cumbersome issues such as data processing, metric implementation, and reproduction. It standardizes the development steps and reduces human errors that may lead to inconsistent experimental settings. In model evaluation, CoTK provides implementations of many commonly used metrics and benchmark models across different experimental settings. As a unique feature, CoTK can signify when and which metrics cannot be fairly compared. We demonstrate that it is convenient to use CoTK for model development and evaluation, particularly across different experimental settings.
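To make the fairness problem concrete, the sketch below (using NLTK, not CoTK's API) shows how two BLEU scores computed under different tokenization settings diverge for the same model output; this is the kind of mismatch a toolkit like CoTK is meant to flag.

```python
# Illustrative sketch (not CoTK's API): identical hypothesis, two preprocessing
# settings, two incomparable "BLEU" numbers.
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

reference = "the cat sat on the mat ."
hypothesis = "the cat is on the mat ."
smooth = SmoothingFunction().method1

# Setting A: whitespace tokenization, punctuation kept as a token.
bleu_a = sentence_bleu([reference.split()], hypothesis.split(),
                       smoothing_function=smooth)

# Setting B: punctuation stripped before tokenization.
bleu_b = sentence_bleu([reference.replace(" .", "").split()],
                       hypothesis.replace(" .", "").split(),
                       smoothing_function=smooth)

# The scores differ even though the model output is identical, so they
# cannot be fairly compared across papers or runs.
print(f"BLEU (setting A): {bleu_a:.4f}")
print(f"BLEU (setting B): {bleu_b:.4f}")
```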
Engineering sketches form the 2D basis of parametric Computer-Aided Design (CAD), the foremost modeling paradigm for manufactured objects. In this paper, we tackle the problem of learning-based engineering sketch generation as a first step towards the synthesis and composition of parametric CAD models. We propose two generative models, CurveGen and TurtleGen, for engineering sketch generation. Both models generate curve primitives without the need for a sketch constraint solver and explicitly consider topology for downstream use with constraints and 3D CAD modeling operations. We find in our perceptual evaluation using human subjects that both CurveGen and TurtleGen produce more realistic engineering sketches when compared with the current state of the art for engineering sketch generation.
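As a rough illustration of what "curve primitives with explicit topology" can mean, the minimal data structure below represents a sketch as primitives whose endpoints reference shared vertices, so connectivity is available for downstream constraints and CAD operations. The structure is an assumption for exposition, not the representation used by CurveGen or TurtleGen.

```python
# Minimal illustrative sketch representation (an assumption, not the paper's format).
from dataclasses import dataclass
from typing import List

@dataclass
class Vertex:
    x: float
    y: float

@dataclass
class Line:
    start: int   # index into the sketch's vertex list
    end: int

@dataclass
class Arc:
    start: int
    end: int
    mid: int     # a third on-curve vertex fixes the arc

@dataclass
class Sketch:
    vertices: List[Vertex]          # shared endpoints make topology explicit
    curves: List[object]            # Line and Arc primitives

# A closed right triangle: three shared vertices, three lines meeting end to end.
triangle = Sketch(
    vertices=[Vertex(0, 0), Vertex(4, 0), Vertex(0, 3)],
    curves=[Line(0, 1), Line(1, 2), Line(2, 0)],
)
```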
HCI and NLP traditionally focus on different evaluation methods. While HCI involves a small number of people directly and deeply, NLP traditionally relies on standardized benchmark evaluations that involve a larger number of people indirectly. We present five methodological proposals at the intersection of HCI and NLP and situate them in the context of ML-based NLP models. Our goal is to foster interdisciplinary collaboration and progress in both fields by emphasizing what the fields can learn from each other.
Unlike previous unknown-noun tagging tasks, this is the first attempt to focus on out-of-vocabulary (OOV) lexical evaluation tasks that do not require any prior knowledge. OOV words are words that appear only in the test samples. The goal of the tasks is to provide solutions for OOV lexical classification and prediction. The tasks require annotators to infer the attributes of OOV words from their surrounding contexts. We then use unsupervised word embedding methods such as Word2Vec and Word2GM to perform baseline experiments on the categorical classification task and the OOV word attribute prediction task.
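A minimal baseline sketch in this spirit appears below: it infers a representation for an OOV word from its context with a gensim Word2Vec model and guesses an attribute by nearest labeled neighbor. The toy corpus, labels, and OOV sentence are placeholders, and the nearest-neighbor rule is an assumption rather than the paper's protocol.

```python
# Hedged baseline sketch: context-based attribute prediction for an OOV word.
import numpy as np
from gensim.models import Word2Vec

corpus = [
    ["the", "chef", "cooked", "rice", "in", "a", "pot"],
    ["she", "drank", "tea", "from", "a", "cup"],
    ["he", "drove", "the", "car", "to", "work"],
]
model = Word2Vec(corpus, vector_size=50, window=3, min_count=1, epochs=50, seed=0)

# Toy attribute labels for in-vocabulary words.
labels = {"rice": "food", "tea": "food", "pot": "container",
          "cup": "container", "car": "vehicle"}

def predict_attribute(context):
    """Average embeddings of in-vocabulary context words, then return the
    label of the most similar labeled word (cosine similarity)."""
    vecs = [model.wv[w] for w in context if w in model.wv]
    centroid = np.mean(vecs, axis=0)
    def cosine(w):
        v = model.wv[w]
        return float(np.dot(v, centroid) /
                     (np.linalg.norm(v) * np.linalg.norm(centroid) + 1e-9))
    return labels[max(labels, key=cosine)]

# "zhurek" is the OOV word; only its context contributes to the prediction.
print(predict_attribute(["she", "poured", "zhurek", "into", "a", "bowl"]))
```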