Readability or difficulty estimation of words and documents has been investigated independently in the literature, often assuming the existence of extensive annotated resources for the other. Motivated by our analysis showing that there is a recursive relationship between word and document difficulty, we propose to jointly estimate word and document difficulty through a graph convolutional network (GCN) in a semi-supervised fashion. Our experimental results reveal that the GCN-based method can achieve higher accuracy than strong baselines, and stays robust even with a smaller amount of labeled data.
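The joint estimation idea can be illustrated with a minimal sketch. It assumes words and documents are nodes of one graph, linked whenever a word occurs in a document, and that difficulty labels are known for only some nodes. The abstract does not specify the graph construction, node features, or layer sizes, so everything here is an illustrative assumption rather than the authors' model: the one-hot features, the two-layer design, the symmetric normalization with self-loops, the toy vocabulary, and the helper names `build_adjacency`, `gcn_forward`, and `masked_cross_entropy`. Only the forward pass and the semi-supervised loss are shown; training is omitted.

```python
import numpy as np

def build_adjacency(doc_tokens, vocab):
    """Normalized word-document adjacency: D^-1/2 (A + I) D^-1/2.
    Nodes 0..n_docs-1 are documents; the remaining nodes are words."""
    n_docs = len(doc_tokens)
    n = n_docs + len(vocab)
    A = np.zeros((n, n))
    for d, tokens in enumerate(doc_tokens):
        for tok in tokens:
            w = n_docs + vocab[tok]
            A[d, w] = A[w, d] = 1.0      # undirected word-document edge
    A += np.eye(n)                        # self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
    return A * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def gcn_forward(A_hat, X, W0, W1):
    """Two graph-convolution layers: softmax(A_hat ReLU(A_hat X W0) W1)."""
    H = np.maximum(A_hat @ X @ W0, 0.0)
    return softmax(A_hat @ H @ W1)

def masked_cross_entropy(probs, labels, mask):
    """Average cross-entropy over labeled nodes only (semi-supervised)."""
    idx = np.flatnonzero(mask)
    return float(-np.log(probs[idx, labels[idx]] + 1e-12).mean())

# Toy data (hypothetical): 3 documents, 4 words, 3 difficulty levels.
docs = [["cat", "sat"], ["cat", "ontology"], ["ontology", "epistemic"]]
vocab = {"cat": 0, "sat": 1, "ontology": 2, "epistemic": 3}
A_hat = build_adjacency(docs, vocab)
n_nodes = A_hat.shape[0]

X = np.eye(n_nodes)                       # assumption: one-hot node features
rng = np.random.default_rng(0)
W0 = rng.normal(scale=0.1, size=(n_nodes, 8))
W1 = rng.normal(scale=0.1, size=(8, 3))

# -1 marks unlabeled nodes; documents and words share one label space.
labels = np.array([0, -1, 2, 0, -1, -1, 2])
mask = labels >= 0

probs = gcn_forward(A_hat, X, W0, W1)
loss = masked_cross_entropy(probs, labels, mask)
print(probs.shape, loss)
```

Because each layer multiplies by the shared adjacency matrix, a document's prediction draws on the words it contains and a word's prediction draws on the documents it appears in, which is one way to encode the recursive relationship the abstract describes. The loss covers only labeled nodes, so unlabeled words and documents influence the result through the graph structure alone.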
References used
https://aclanthology.org/