Recent developments in natural language generation (NLG) have bolstered arguments in favor of re-introducing explicit encoding of discourse relations in the input to neural models. In the Methodius corpus, a meaning representation (MR) is hierarchically structured and includes discourse relations. Meanwhile, pre-trained language models have been shown to implicitly encode rich linguistic knowledge, which provides an excellent resource for NLG. Synthesizing these lines of research, we conduct extensive experiments on the benefits of using pre-trained models and discourse relation information in MRs, focusing on improving discourse coherence and correctness. We redesign the Methodius corpus; we also construct an alternative Methodius corpus in which MRs are flat rather than hierarchically structured. We report experiments on different versions of the corpora, which probe when, where, and how pre-trained models benefit from MRs that include discourse relation information. We conclude that discourse relations significantly improve NLG when data is limited.