We report on experiments in automatic text simplification (ATS) for German with multiple simplification levels along the Common European Framework of Reference for Languages (CEFR), simplifying standard German into levels A1, A2 and B1. For that purpose, we investigate the use of source labels and pretraining on standard German, allowing us to simplify standard language to a specific CEFR level. We show that these approaches are especially effective in low-resource scenarios, where we are able to outperform a standard transformer baseline. Moreover, we introduce copy labels, which we show can help the model make a distinction between sentences that require further modifications and sentences that can be copied as-is.
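As a rough illustration of the labeling idea described above, the following sketch prepends a target-level label, and optionally a copy label, to each standard German source sentence before training. The token spellings (`<A1>`, `<copy>`, `<edit>`), the `Pair` type, and the exact-match copy test are assumptions made for this example, not the format used in the experiments.

```python
from dataclasses import dataclass

# CEFR target levels named in the abstract.
LEVELS = ("A1", "A2", "B1")


@dataclass(frozen=True)
class Pair:
    """One training example (hypothetical container for this sketch)."""
    source: str  # standard German sentence
    target: str  # simplified sentence
    level: str   # CEFR level of the target


def label_source(pair: Pair, use_copy_label: bool = True) -> str:
    """Return the source sentence prefixed with its label tokens."""
    if pair.level not in LEVELS:
        raise ValueError(f"unknown CEFR level: {pair.level}")
    # Source label: tells the model which level to simplify towards.
    tokens = [f"<{pair.level}>"]
    if use_copy_label:
        # Copy label (assumed definition): the target equals the source.
        is_copy = pair.source.strip() == pair.target.strip()
        tokens.append("<copy>" if is_copy else "<edit>")
    return " ".join(tokens + [pair.source])


if __name__ == "__main__":
    example = Pair("Das ist gut.", "Das ist gut.", "A2")
    print(label_source(example))
```

The sketch covers only the preparation of training data. How a copy label would be obtained at inference time, when no target sentence exists, is left open here.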
References used
https://aclanthology.org/