Knowledge tracing (KT) defines the task of predicting whether students can correctly answer questions based on their historical responses. Although much research has been devoted to exploiting question information, plentiful side information among questions and skills has not been well extracted, making it challenging for previous work to perform adequately. In this paper, we demonstrate that large gains on KT can be realized by pre-training embeddings for each question on abundant side information, followed by training deep KT models on the obtained embeddings. To be specific, the side information includes question difficulty and three kinds of relations contained in a bipartite graph between questions and skills. To pre-train the question embeddings, we propose to use product-based neural networks to recover the side information. As a result, adopting the pre-trained embeddings in existing deep KT models significantly outperforms state-of-the-art baselines on three common KT datasets.
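As a rough illustration of the pre-training step described above, the following PyTorch sketch (all names, sizes, and the training data are hypothetical, and the product layer is heavily simplified) trains question and skill embeddings so that inner products recover question-skill links from the bipartite graph while a small product-style fusion head recovers question difficulty; it is a sketch under these assumptions, not the paper's exact architecture.

```python
# Minimal, illustrative sketch of pre-training question embeddings on side
# information (question-skill relations + difficulty). Names are hypothetical.
import torch
import torch.nn as nn
import torch.nn.functional as F

class QuestionEmbeddingPretrainer(nn.Module):
    def __init__(self, num_questions, num_skills, dim=64):
        super().__init__()
        self.q_emb = nn.Embedding(num_questions, dim)   # question embeddings to be pre-trained
        self.s_emb = nn.Embedding(num_skills, dim)      # skill embeddings
        # Small head that recovers question difficulty from fused features.
        self.difficulty_head = nn.Sequential(
            nn.Linear(dim * 2 + 3, dim), nn.ReLU(), nn.Linear(dim, 1)
        )

    def relation_logits(self, a, b):
        # Inner-product scores used to recover relations between embeddings
        # (here only question-skill links; question-question and skill-skill
        # similarities would be handled analogously).
        return (a * b).sum(-1)

    def forward(self, q_ids, s_ids):
        q = self.q_emb(q_ids)                 # (B, dim)
        s = self.s_emb(s_ids)                 # (B, dim)
        # Simplified stand-in for a product layer: concatenate both vectors
        # with their pairwise inner products before the difficulty head.
        products = torch.stack(
            [(q * s).sum(-1), (q * q).sum(-1), (s * s).sum(-1)], dim=-1
        )
        fused = torch.cat([q, s, products], dim=-1)
        return self.difficulty_head(fused).squeeze(-1)

# Illustrative training step with random stand-in data.
model = QuestionEmbeddingPretrainer(num_questions=1000, num_skills=100)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

q_ids = torch.randint(0, 1000, (32,))          # sampled questions
s_ids = torch.randint(0, 100, (32,))           # sampled skills
qs_label = torch.randint(0, 2, (32,)).float()  # 1 if the skill tags the question
diff_label = torch.rand(32)                    # empirical difficulty in [0, 1]

pred_diff = model(q_ids, s_ids)
rel_logit = model.relation_logits(model.q_emb(q_ids), model.s_emb(s_ids))
loss = F.binary_cross_entropy_with_logits(rel_logit, qs_label) \
     + F.mse_loss(torch.sigmoid(pred_diff), diff_label)
loss.backward()
opt.step()
```

After pre-training, the learned rows of the question embedding table would be frozen or fine-tuned and fed to an existing deep KT model in place of randomly initialized question embeddings.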