We introduce a novel top-down, end-to-end formulation of document-level discourse parsing in the Rhetorical Structure Theory (RST) framework. In this formulation, we treat discourse parsing as a sequence of splitting decisions at token boundaries and use a seq2seq network to model those decisions. Our framework supports discourse parsing from scratch, without requiring discourse segmentation as a prerequisite; instead, it produces the segmentation as part of the parsing process. Our unified parsing model uses beam search to decode the best tree structure by searching through a space of high-scoring trees. With extensive experiments on the standard RST discourse treebank, we demonstrate that our parser outperforms existing methods by a good margin both in end-to-end parsing and in parsing with gold segmentation. More importantly, it does so without using any handcrafted features, which makes it faster and easily adaptable to new languages and domains.
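The idea of parsing as a sequence of splitting decisions at token boundaries can be illustrated with a minimal sketch. It is not the authors' model: the names `parse_top_down`, `segments`, `split_score`, and `threshold` are hypothetical, the scorer is a hand-written toy that stands in for the seq2seq network, and the decoding is greedy rather than the beam search described above. A span that has no boundary scoring above the threshold is left unsplit, which is how a segmentation can fall out of the parsing process itself.

```python
from typing import Callable, List, Tuple, Union

# A leaf is a token span (start, end); an internal node is a pair of subtrees.
Span = Tuple[int, int]
Tree = Union[Span, Tuple["Tree", "Tree"]]


def parse_top_down(
    i: int,
    j: int,
    split_score: Callable[[int, int, int], float],
    threshold: float = 0.0,
) -> Tree:
    """Greedily split the token span [i, j) at its best-scoring boundary.

    `split_score(i, k, j)` is a hypothetical stand-in for a learned model
    that scores splitting [i, j) into [i, k) and [k, j).
    """
    candidates = [(split_score(i, k, j), k) for k in range(i + 1, j)]
    if not candidates:
        return (i, j)  # single token: cannot be split further
    best_score, k = max(candidates)
    if best_score <= threshold:
        return (i, j)  # no confident split: keep the span as one unit
    left = parse_top_down(i, k, split_score, threshold)
    right = parse_top_down(k, j, split_score, threshold)
    return (left, right)


def segments(tree: Tree) -> List[Span]:
    """Read the segmentation off the tree by collecting leaves left to right."""
    if isinstance(tree[0], int):
        return [tree]  # leaf span
    return segments(tree[0]) + segments(tree[1])


# Toy scorer (an assumption for illustration): only boundaries 3 and 6 are
# plausible split points, and boundary 3 is the stronger one.
BOUNDARY_SCORES = {3: 2.0, 6: 1.0}


def toy_score(i: int, k: int, j: int) -> float:
    return BOUNDARY_SCORES.get(k, -1.0)


tree = parse_top_down(0, 9, toy_score)
# tree == ((0, 3), ((3, 6), (6, 9)))
# segments(tree) == [(0, 3), (3, 6), (6, 9)]
```

A beam-search variant would keep several partial trees alive at each step, ranked by their accumulated split scores, instead of committing to the single best boundary per span.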