
Keep it Simple: Unsupervised Simplification of Multi-Paragraph Text

Published by Philippe Laban
Publication date: 2021
Research field: Informatics Engineering
Research language: English





This work presents Keep it Simple (KiS), a new approach to unsupervised text simplification which learns to balance a reward across three properties: fluency, salience and simplicity. We train the model with a novel algorithm to optimize the reward (k-SCST), in which the model proposes several candidate simplifications, computes each candidate's reward, and encourages candidates that outperform the mean reward. Finally, we propose a realistic text comprehension task as an evaluation method for text simplification. When tested on the English news domain, the KiS model outperforms strong supervised baselines by more than 4 SARI points, and can help people complete a comprehension task an average of 18% faster while retaining accuracy, when compared to the original text. Code available: https://github.com/tingofurro/keep_it_simple
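
The k-SCST update described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function names `advantages` and `k_scst_loss` are hypothetical, the rewards are assumed to be already-computed scalars (one per candidate), and each log-probability is assumed to be the summed token log-likelihood of one candidate under the model. The only idea taken from the abstract is that the mean reward of the k candidates serves as the baseline.

```python
from statistics import mean
from typing import List


def advantages(rewards: List[float]) -> List[float]:
    """Reward of each candidate minus the mean reward of all k candidates."""
    if not rewards:
        raise ValueError("at least one candidate reward is required")
    baseline = mean(rewards)
    return [r - baseline for r in rewards]


def k_scst_loss(log_probs: List[float], rewards: List[float]) -> float:
    """Hypothetical scalar loss for one source text with k candidates.

    Minimizing this value raises the log-probability of candidates whose
    reward is above the mean and lowers it for those below the mean.
    """
    if len(log_probs) != len(rewards):
        raise ValueError("need one log-probability per candidate reward")
    # Negative advantage times log-probability, summed over candidates.
    return sum(-adv * lp for adv, lp in zip(advantages(rewards), log_probs))


if __name__ == "__main__":
    # Three made-up candidates: rewards and summed log-likelihoods.
    example_rewards = [0.2, 0.5, 0.8]
    example_log_probs = [-1.0, -2.0, -3.0]
    print(advantages(example_rewards))
    print(k_scst_loss(example_log_probs, example_rewards))
```

In a real training loop the log-probabilities would be differentiable tensors from the generator, and the rewards would come from the fluency, salience and simplicity scorers; both are replaced by plain floats here to keep the sketch self-contained.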




Read also

We consider the problem of learning to simplify medical texts. This is important because most reliable, up-to-date information in biomedicine is dense with jargon and thus practically inaccessible to the lay audience. Furthermore, manual simplification does not scale to the rapidly growing body of biomedical literature, motivating the need for automated approaches. Unfortunately, there are no large-scale resources available for this task. In this work we introduce a new corpus of parallel texts in English comprising technical and lay summaries of all published evidence pertaining to different clinical topics. We then propose a new metric based on likelihood scores from a masked language model pretrained on scientific texts. We show that this automated measure better differentiates between technical and lay summaries than existing heuristics. We introduce and evaluate baseline encoder-decoder Transformer models for simplification and propose a novel augmentation to these in which we explicitly penalize the decoder for producing jargon terms; we find that this yields improvements over baselines in terms of readability.
Text Simplification (TS) aims to reduce the linguistic complexity of content to make it easier to understand. Research in TS has been of keen interest, especially as approaches to TS have shifted from manual, hand-crafted rules to automated simplification. This survey seeks to provide a comprehensive overview of TS, including a brief description of earlier approaches, a discussion of various aspects of simplification (lexical, semantic and syntactic), and the latest techniques being used in the field. We note that research in the field has clearly shifted towards deep learning techniques for TS, with a specific focus on developing solutions to combat the lack of data available for simplification. We also include a discussion of commonly used datasets and evaluation metrics, along with a discussion of related fields within Natural Language Processing (NLP), such as semantic similarity.
Much of modern-day text simplification research focuses on sentence-level simplification, transforming original, more complex sentences into simplifie
This work improves monolingual sentence alignment for text simplification, specifically for text in standard and simple Wikipedia. We introduce a convolutional neural network structure to model similarity between two sentences. Due to the limited availability of parallel corpora, the model is trained in a semi-supervised way, using the output of a knowledge-based, high-performance alignment system. We apply the resulting similarity score to rescore the knowledge-based output, and adapt the model using a small hand-aligned dataset. Experiments show that both rescoring and adaptation improve the performance of the knowledge-based method.