Reference-less Quality Estimation of Text Simplification Systems

Added by Louis Martin

Publication date 2019

Fields Informatics Engineering

Research language English

Authors Louis Martin

Computation and Language

The evaluation of text simplification (TS) systems remains an open challenge. As the task has common points with machine translation (MT), TS is often evaluated using MT metrics such as BLEU. However, such metrics require high-quality reference data, which is rarely available for TS. TS has the advantage over MT of being a monolingual task, which allows for direct comparisons to be made between the simplified text and its original version. In this paper, we compare multiple approaches to reference-less quality estimation of sentence-level text simplification systems, based on the dataset used for the QATS 2016 shared task. We distinguish three different dimensions: grammaticality, meaning preservation and simplicity. We show that n-gram-based MT metrics such as BLEU and METEOR correlate the most with human judgment of grammaticality and meaning preservation, whereas simplicity is best evaluated by basic length-based metrics.
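
To make the idea of reference-less comparison concrete, the sketch below scores a simplified sentence directly against its source, with no human reference. It is an illustration only, not the paper's implementation: `ngram_precision` is a simplified stand-in for BLEU-style overlap (clipped n-gram precision only, without BLEU's brevity penalty or averaging over n-gram orders), and `compression_ratio` is one plausible example of a basic length-based feature. Both function names, the whitespace tokenization, and the lowercasing are assumptions made for this example.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def ngram_precision(source, output, n=2):
    """Clipped n-gram precision of the output measured against the source.

    Hypothetical helper: a rough proxy for n-gram-based MT metrics,
    using naive whitespace tokenization (an assumption of this sketch).
    """
    src_counts = ngrams(source.lower().split(), n)
    out_counts = ngrams(output.lower().split(), n)
    total = sum(out_counts.values())
    if total == 0:
        return 0.0
    # Each output n-gram is credited at most as often as it occurs in the source.
    matched = sum(min(count, src_counts[gram]) for gram, count in out_counts.items())
    return matched / total


def compression_ratio(source, output):
    """Character length of the output divided by that of the source.

    Hypothetical helper: an example of a basic length-based feature.
    """
    return len(output) / len(source) if source else 0.0


if __name__ == "__main__":
    src = "the cat sat on the mat"
    out = "the cat sat"
    print(ngram_precision(src, out, n=2))
    print(compression_ratio(src, out))
```

High overlap with the source suggests that little was changed, which is consistent with preserved meaning and grammaticality, while a lower compression ratio indicates a shorter output. How well such features track human judgment is what the paper evaluates.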

Simple-QE: Better Automatic Quality Estimation for Text Simplification

120 - Reno Kriz , Marianna Apidianaki , Chris Callison-Burch 2020

Text simplification systems genera

Computation and Language

A Survey on Text Simplification

73 - Punardeep Sikka , Vijay Mago 2020

Text Simplification (TS) aims to reduce the linguistic complexity of content to make it easier to understand. Research in TS has been of keen interest, especially as approaches to TS have shifted from manual, hand-crafted rules to automated simplification. This survey seeks to provide a comprehensive overview of TS, including a brief description of earlier approaches used, discussion of various aspects of simplification (lexical, semantic and syntactic), and the latest techniques being utilized in the field. We note that the research in the field has clearly shifted towards utilizing deep learning techniques to perform TS, with a specific focus on developing solutions to combat the lack of data available for simplification. We also include a discussion of datasets and evaluation metrics commonly used, along with discussion of related fields within Natural Language Processing (NLP), like semantic similarity.

Computation and Language

Elaborative Simplification: Content Addition and Explanation Generation in Text Simplification

74 - Neha Srikanth , Junyi Jessy Li 2020

Much of modern-day text simplification research focuses on sentence-level simplification, transforming original, more complex sentences into simplifie

Computation and Language

Monolingual sentence matching for text simplification

96 - Yonghui Huang , Yunhui Li , Yi Luan 2018

This work improves monolingual sentence alignment for text simplification, specifically for text in standard and simple Wikipedia. We introduce a convolutional neural network structure to model similarity between two sentences. Due to the limited availability of parallel corpora, the model is trained in a semi-supervised way, using the output of a knowledge-based, high-performance alignment system. We apply the resulting similarity score to rescore the knowledge-based output, and adapt the model using a small hand-aligned dataset. Experiments show that both rescoring and adaptation improve the performance of the knowledge-based method.

Computation and Language

Controllable Text Simplification with Explicit Paraphrasing

87 - Mounica Maddela , Fernando Alva-Manchego , Wei Xu 2020

Text Simplification improves the readability of sentences through several rewriting transformations, such as lexical paraphrasing, deletion, and splitting. Current simplification systems are predominantly sequence-to-sequence models that are trained end-to-end to perform all these operations simultaneously. However, such systems limit themselves to mostly deleting words and cannot easily adapt to the requirements of different target audiences. In this paper, we propose a novel hybrid approach that leverages linguistically-motivated rules for splitting and deletion, and couples them with a neural paraphrasing model to produce varied rewriting styles. We introduce a new data augmentation method to improve the paraphrasing capability of our model. Through automatic and manual evaluations, we show that our proposed model establishes a new state-of-the-art for the task, paraphrasing more often than the existing systems, and can control the degree of each simplification operation applied to the input texts.

Computation and Language