Research papers and master's and doctoral theses on text simplification

A New Dataset and Efficient Baselines for Document-level Text Simplification in German

324 - Association for Computational Linguistics 2021 article

The task of document-level text simplification is very similar to summarization, with the additional difficulty of reducing complexity. We introduce a newly collected data set of German texts from the Swiss news magazine 20 Minuten ('20 Minutes') that consists of full articles paired with simplified summaries. Furthermore, we present experiments on automatic text simplification with the pretrained multilingual mBART and a modified version thereof that is more memory-friendly, using both our new data set and existing simplification corpora. Our modifications of mBART let us train at a lower memory cost without much loss in performance; in fact, the smaller mBART even improves over the standard model in a setting with multiple simplification levels.

dataset and efficient, efficient baselines, document-level text simplification

Exploring German Multi-Level Text Simplification

222 - Association for Computational Linguistics 2021 article

We report on experiments in automatic text simplification (ATS) for German with multiple simplification levels along the Common European Framework of Reference for Languages (CEFR), simplifying standard German into levels A1, A2 and B1. For that purpose, we investigate the use of source labels and pretraining on standard German, allowing us to simplify standard language to a specific CEFR level. We show that these approaches are especially effective in low-resource scenarios, where we are able to outperform a standard transformer baseline. Moreover, we introduce copy labels, which we show can help the model make a distinction between sentences that require further modifications and sentences that can be copied as-is.

exploring german multi-level, german multi-level text, multi-level text simplification

Flesch-Kincaid is Not a Text Simplification Evaluation Metric

169 - Association for Computational Linguistics 2021 article

Sentence-level text simplification is currently evaluated using both automated metrics and human evaluation. For automatic evaluation, a combination of metrics is usually employed to evaluate different aspects of the simplification. Flesch-Kincaid Grade Level (FKGL) is one metric that has been regularly used to measure the readability of system output. In this paper, we argue that FKGL should not be used to evaluate text simplification systems. We provide experimental analyses on recent system output showing that the FKGL score can easily be manipulated to improve the score dramatically with only minor impact on other automated metrics (BLEU and SARI). Instead of using FKGL, we suggest that the component statistics, along with others, be used for posthoc analysis to understand system behavior.
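
To make the abstract's point concrete, here is a minimal sketch of the standard FKGL formula, 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59. The syllable counter is a crude vowel-group heuristic and the example sentences are made up for illustration; neither comes from the paper, whose own manipulation experiments may differ.

```python
import re


def count_syllables(word: str) -> int:
    """Crude heuristic (an assumption): one syllable per run of vowels, minimum 1."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def fkgl(text: str) -> float:
    """Flesch-Kincaid Grade Level computed from its three component counts."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one word")
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59


# Same words, same syllables; only a period is inserted in the middle.
original = "The committee postponed the decision and the members left the room."
split = "The committee postponed the decision. And the members left the room."

print(f"original: {fkgl(original):.2f}")
print(f"split:    {fkgl(split):.2f}")
```

Because the wording is unchanged, only the words-per-sentence term moves, so the score drops even though nothing has been made easier to read. This is one simple way a grade-level score can be lowered without real simplification.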

text simplification evaluation, flesch-kincaid grade level

What Makes a Concept Complex? Measuring Conceptual Complexity as a Precursor for Text Simplification

180 - Association for Computational Linguistics 2021 article

Advancements within the field of text simplification (TS) have primarily been within syntactic or lexical simplification. However, conceptual simplification has previously been identified as another field of TS that has the potential to significantly improve reading comprehension. A first step to measuring conceptual simplification is the classification of concepts as either complex or simple. This research-in-progress paper proposes a new definition of conceptual complexity alongside a simple machine-learning approach that performs a binary classification task to distinguish between simple and complex concepts. It is proposed that this be a first step when developing new text simplification models that operate on a conceptual level.

text simplification, simplification makes

Controllable Text Simplification with Explicit Paraphrasing

533 - Association for Computational Linguistics 2021 article

Text Simplification improves the readability of sentences through several rewriting transformations, such as lexical paraphrasing, deletion, and splitting. Current simplification systems are predominantly sequence-to-sequence models that are trained end-to-end to perform all these operations simultaneously. However, such systems limit themselves to mostly deleting words and cannot easily adapt to the requirements of different target audiences. In this paper, we propose a novel hybrid approach that leverages linguistically-motivated rules for splitting and deletion, and couples them with a neural paraphrasing model to produce varied rewriting styles. We introduce a new data augmentation method to improve the paraphrasing capability of our model. Through automatic and manual evaluations, we show that our proposed model establishes a new state-of-the-art for the task, paraphrasing more often than the existing systems, and can control the degree of each simplification operation applied to the input texts.

controllable text simplification, explicit paraphrasing

Improving Human Text Simplification with Sentence Fusion

284 - Association for Computational Linguistics 2021 article

The quality of fully automated text simplification systems is not good enough for use in real-world settings; instead, human simplifications are used. In this paper, we examine how to improve the cost and quality of human simplifications by leveraging crowdsourcing. We introduce a graph-based sentence fusion approach to augment human simplifications and a reranking approach to both select high quality simplifications and to allow for targeting simplifications with varying levels of simplicity. Using the Newsela dataset (Xu et al., 2015) we show consistent improvements over experts at varying simplification levels and find that the additional sentence fusion simplifications allow for simpler output than the human simplifications alone.

improving human text, human text simplification, human simplifications

Finding Spoiler Bias in Tweets by Zero-shot Learning and Knowledge Distilling from Neural Text Simplification

167 - Association for Computational Linguistics 2021 article

Automatic detection of critical plot information in reviews of media items poses unique challenges to both social computing and computational linguistics. In this paper we propose to cast the problem of discovering spoiler bias in online discourse as a text simplification task. We conjecture that for an item-user pair, the simpler the user review we learn from an item summary, the higher its likelihood to present a spoiler. Our neural model incorporates the advanced transformer network to rank the severity of a spoiler in user tweets. We constructed a sustainable high-quality movie dataset scraped from unsolicited review tweets and paired with a title summary and meta-data extracted from a movie-specific domain. To a large extent, our quantitative and qualitative results weigh in on the performance impact of named entity presence in plot summaries. Pretrained on a split-and-rephrase corpus with knowledge distilled from English Wikipedia and fine-tuned on our movie dataset, our neural model outperforms both a language modeler and monolingual translation baselines.

finding spoiler bias, neural text simplification
