Improving Neural RST Parsing Model with Silver Agreement Subtrees

Added by Association for Computational Linguistics (article)

Publication date 2021

Field: Artificial Intelligence

Research language: English

Created by Shamra Editor

Abstract

Most previous Rhetorical Structure Theory (RST) parsing methods are based on supervised learning with models such as neural networks, which require an annotated corpus of sufficient size and quality. However, the RST Discourse Treebank (RST-DT), the benchmark corpus for RST parsing in English, is small because annotating RST trees is costly. The lack of large annotated training data causes poor performance, especially in relation labeling. Therefore, we propose a method for improving neural RST parsing models by exploiting silver data, i.e., automatically annotated data. We create large-scale silver data from an unlabeled corpus by using a state-of-the-art RST parser. To obtain high-quality silver data, we extract agreement subtrees from the RST trees that the RST parsers build for each document. We then pre-train a neural RST parser with the obtained silver data and fine-tune it on the RST-DT. Experimental results show that our method achieved the best micro-F1 scores for Nuclearity and Relation, at 75.0 and 63.2, respectively. Furthermore, we obtained a remarkable gain of 3.0 points in the Relation score over the previous state-of-the-art parser.
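To make the idea of agreement subtrees more concrete, here is a minimal Python sketch of one plausible reading: keep the largest subtrees on which every parser's output is identical in span, structure, and labels. The tree encoding (an integer EDU index for a leaf, a `(label, left, right)` tuple for an internal node), the label strings, and the names `collect_subtrees`, `agreement_subtrees`, and `min_edus` are all assumptions made for illustration; the abstract does not specify the authors' actual procedure.

```python
from typing import Union

# Assumed encoding: a leaf is an EDU index (int);
# an internal node is a tuple (label, left_child, right_child).
Tree = Union[int, tuple]


def collect_subtrees(tree: Tree, table: dict) -> tuple:
    """Store every internal subtree in `table`, keyed by its (first, last) EDU span."""
    if isinstance(tree, int):
        return (tree, tree)
    _label, left, right = tree
    start, _ = collect_subtrees(left, table)
    _, end = collect_subtrees(right, table)
    table[(start, end)] = tree
    return (start, end)


def agreement_subtrees(trees: list, min_edus: int = 2) -> list:
    """Return the maximal subtrees that are identical in all parser outputs."""
    tables = []
    for tree in trees:
        table = {}
        collect_subtrees(tree, table)
        tables.append(table)
    first, rest = tables[0], tables[1:]

    # A span is agreed when every parser built exactly the same subtree over it.
    # Tuple equality compares the nested structure and the labels together.
    agreed = [
        span
        for span, sub in first.items()
        if span[1] - span[0] + 1 >= min_edus
        and all(other.get(span) == sub for other in rest)
    ]

    # Drop agreed spans that lie inside a larger agreed span.
    maximal = [
        s
        for s in agreed
        if not any(o != s and o[0] <= s[0] and s[1] <= o[1] for o in agreed)
    ]
    return [first[s] for s in sorted(maximal)]


if __name__ == "__main__":
    # Two hypothetical parser outputs for the same four-EDU document.
    parser_a = ("NS:Elaboration", ("NN:Joint", 0, 1), ("NS:Attribution", 2, 3))
    parser_b = ("SN:Background", ("NN:Joint", 0, 1), ("NS:Attribution", 2, 3))
    print(agreement_subtrees([parser_a, parser_b]))
```

In this toy case the two parsers disagree on the root label, so only the two lower subtrees would be kept as silver data. Raising `min_edus` is one simple way to discard very small subtrees, which is again an assumption rather than something stated in the abstract.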

References used

https://aclanthology.org/

WikiNEuRal: Combined Neural and Knowledge-based Silver Data Creation for Multilingual NER

Association for Computational Linguistics, 2021 (article)

Multilingual Named Entity Recognition (NER) is a key intermediate task which is needed in many areas of NLP. In this paper, we address the well-known issue of data scarcity in NER, especially relevant when moving to a multilingual scenario, and go beyond current approaches to the creation of multilingual silver data for the task. We exploit the texts of Wikipedia and introduce a new methodology based on the effective combination of knowledge-based approaches and neural models, together with a novel domain adaptation technique, to produce high-quality training corpora for NER. We evaluate our datasets extensively on standard benchmarks for NER, yielding substantial improvements up to 6 span-based F1-score points over previous state-of-the-art systems for data creation.

RST Parsing from Scratch

Association for Computational Linguistics, 2021 (article)

We introduce a novel top-down end-to-end formulation of document-level discourse parsing in the Rhetorical Structure Theory (RST) framework. In this formulation, we consider discourse parsing as a sequence of splitting decisions at token boundaries and use a seq2seq network to model the splitting decisions. Our framework facilitates discourse parsing from scratch without requiring discourse segmentation as a prerequisite; rather, it yields segmentation as part of the parsing process. Our unified parsing model adopts a beam search to decode the best tree structure by searching through a space of high-scoring trees. With extensive experiments on the standard RST discourse treebank, we demonstrate that our parser outperforms existing methods by a good margin in both end-to-end parsing and parsing with gold segmentation. More importantly, it does so without using any handcrafted features, making it faster and easily adaptable to new languages and domains.

Context Tracking Network: Graph-based Context Modeling for Implicit Discourse Relation Recognition

Association for Computational Linguistics, 2021 (article)

Implicit discourse relation recognition (IDRR) aims to identify logical relations between two adjacent sentences in the discourse. Existing models fail to fully utilize the contextual information, which plays an important role in interpreting each local sentence. In this paper, we thus propose a novel graph-based Context Tracking Network (CT-Net) to model the discourse context for IDRR. The CT-Net first converts the discourse into the paragraph association graph (PAG), where each sentence tracks its closely related context from the intricate discourse through different types of edges. Then, the CT-Net extracts contextual representation from the PAG through a specially designed cross-grained updating mechanism, which can effectively integrate both sentence-level and token-level contextual semantics. Experiments on PDTB 2.0 show that the CT-Net gains better performance than models that roughly model the context.

Proof Net Structure for Neural Lambek Categorial Parsing

Association for Computational Linguistics, 2021 (article)

In this paper, we present the first statistical parser for Lambek categorial grammar (LCG), a grammatical formalism for which the graphical proof method known as *proof nets* is applicable. Our parser incorporates proof net structure and constraints into a system based on self-attention networks via novel model elements. Our experiments on an English LCG corpus show that incorporating term graph structure is helpful to the model, improving both parsing accuracy and coverage. Moreover, we derive novel loss functions by expressing proof net constraints as differentiable functions of our model output, enabling us to train our parser without ground-truth derivations.

Stacked AMR Parsing with Silver Data

Association for Computational Linguistics, 2021 (article)

The lack of sufficient human-annotated data is one main challenge for abstract meaning representation (AMR) parsing. To alleviate this problem, previous works usually make use of silver data or pre-trained language models. In particular, one recent seq-to-seq work directly fine-tunes AMR graph sequences on the encoder-decoder pre-trained language model and achieves new state-of-the-art results, outperforming previous works by a large margin. However, it makes the decoding relatively slower. In this work, we investigate alternative approaches to achieve competitive performance at faster speeds. We propose a simplified AMR parser and a pre-training technique for the effective usage of silver data. We conduct extensive experiments on the widely used AMR2.0 dataset and the results demonstrate that our Transformer-based AMR parser achieves the best performance among the seq2graph-based models. Furthermore, with silver data, our model achieves competitive results with the SOTA model, and the speed is an order of magnitude faster. Detailed analyses are conducted to gain more insights into our proposed model and the effectiveness of the pre-training technique.
