Existing work on probing of pretrained language models (LMs) has predominantly focused on sentence-level syntactic tasks. In this paper, we introduce document-level discourse probing to evaluate the ability of pretrained LMs to capture document-level relations. We experiment with 7 pretrained LMs, 4 languages, and 7 discourse probing tasks, and find BART to be overall the best model at capturing discourse, but only in its encoder, with BERT performing surprisingly well as the baseline model. Across the different models, there are substantial differences in which layers best capture discourse information, and large disparities between models.
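The probing procedure itself is straightforward to sketch. Below is a minimal, illustrative implementation, not the authors' released code: it assumes HuggingFace transformers and scikit-learn, uses bert-base-uncased as a stand-in for any of the 7 probed LMs, and a toy sentence-ordering pair standing in for the paper's annotated discourse tasks. The idea is to freeze the LM, mean-pool the hidden states of every layer, and fit a lightweight linear probe per layer, so that per-layer probe accuracy indicates where discourse information is captured.

import torch
from transformers import AutoTokenizer, AutoModel
from sklearn.linear_model import LogisticRegression

MODEL_NAME = "bert-base-uncased"  # illustrative; any probed LM could be swapped in

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModel.from_pretrained(MODEL_NAME, output_hidden_states=True)
model.eval()  # the LM stays frozen; only the probes are trained

def layer_features(texts):
    """Return one mean-pooled feature matrix per layer (embeddings + each encoder layer)."""
    feats = None
    with torch.no_grad():
        for text in texts:
            enc = tokenizer(text, return_tensors="pt", truncation=True)
            hidden = model(**enc).hidden_states  # tuple of [1, seq_len, dim] tensors
            vecs = [h.mean(dim=1).squeeze(0) for h in hidden]  # pool over tokens
            if feats is None:
                feats = [[] for _ in vecs]
            for i, v in enumerate(vecs):
                feats[i].append(v.numpy())
    return feats

# Toy binary task standing in for a discourse probe (e.g. "are these two
# sentences in their original order?"); real probes use discourse corpora.
texts = ["She opened the door. Then she walked in.",
         "Then she walked in. She opened the door."]
labels = [1, 0]

for layer, X in enumerate(layer_features(texts)):
    probe = LogisticRegression(max_iter=1000).fit(X, labels)
    print(f"layer {layer}: train acc = {probe.score(X, labels):.2f}")

Comparing probe accuracy across layers yields the kind of layer-wise analysis the abstract refers to, where different models peak at different depths.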