Do you want to publish a course? Click here

Asking Crowdworkers to Write Entailment Examples: The Best of Bad Options

199   0   0.0 ( 0 )
 Added by Clara Vania
 Publication date 2020
and research's language is English




Ask ChatGPT about the research

Large-scale natural language inference (NLI) datasets such as SNLI or MNLI have been created by asking crowdworkers to read a premise and write three new hypotheses, one for each possible semantic relationships (entailment, contradiction, and neutral). While this protocol has been used to create useful benchmark data, it remains unclear whether the writing-based annotation protocol is optimal for any purpose, since it has not been evaluated directly. Furthermore, there is ample evidence that crowdworker writing can introduce artifacts in the data. We investigate two alternative protocols which automatically create candidate (premise, hypothesis) pairs for annotators to label. Using these protocols and a writing-based baseline, we collect several new English NLI datasets of over 3k examples each, each using a fixed amount of annotator time, but a varying number of examples to fit that time budget. Our experiments on NLI and transfer learning show negative results: None of the alternative protocols outperforms the baseline in evaluations of generalization within NLI or on transfer to outside target tasks. We conclude that crowdworker writing still the best known option for entailment data, highlighting the need for further data collection work to focus on improving writing-based annotation processes.



rate research

Read More

This work presents a new approach to unsupervised abstractive summarization based on maximizing a combination of coverage and fluency for a given length constraint. It introduces a novel method that encourages the inclusion of key terms from the original document into the summary: key terms are masked out of the original document and must be filled in by a coverage model using the current generated summary. A novel unsupervised training procedure leverages this coverage model along with a fluency model to generate and score summaries. When tested on popular news summarization datasets, the method outperforms previous unsupervised methods by more than 2 R-1 points, and approaches results of competitive supervised methods. Our model attains higher levels of abstraction with copied passages roughly two times shorter than prior work, and learns to compress and merge sentences without supervision.
In this paper, we propose QACE, a new metric based on Question Answering for Caption Evaluation. QACE generates questions on the evaluated caption and checks its content by asking the questions on either the reference caption or the source image. We first develop QACE-Ref that compares the answers of the evaluated caption to its reference, and report competitive results with the state-of-the-art metrics. To go further, we propose QACE-Img, which asks the questions directly on the image, instead of reference. A Visual-QA system is necessary for QACE-Img. Unfortunately, the standard VQA models are framed as a classification among only a few thousand categories. Instead, we propose Visual-T5, an abstractive VQA system. The resulting metric, QACE-Img is multi-modal, reference-less, and explainable. Our experiments show that QACE-Img compares favorably w.r.t. other reference-less metrics. We will release the pre-trained models to compute QACE.
Automatically writing stylized Chinese characters is an attractive yet challenging task due to its wide applicabilities. In this paper, we propose a novel framework named Style-Aware Variational Auto-Encoder (SA-VAE) to flexibly generate Chinese characters. Specifically, we propose to capture the different characteristics of a Chinese character by disentangling the latent features into content-related and style-related components. Considering of the complex shapes and structures, we incorporate the structure information as prior knowledge into our framework to guide the generation. Our framework shows a powerful one-shot/low-shot generalization ability by inferring the style component given a character with unseen style. To the best of our knowledge, this is the first attempt to learn to write new-style Chinese characters by observing only one or a few examples. Extensive experiments demonstrate its effectiveness in generating different stylized Chinese characters by fusing the feature vectors corresponding to different contents and styles, which is of significant importance in real-world applications.
Many enhanced sampling techniques rely on the identification of a number of collective variables that describe all the slow modes of the system. By constructing a bias potential in this reduced space one is then able to sample efficiently and reconstruct the free energy landscape. In methods like metadynamics, the quality of these collective variables plays a key role in convergence efficiency. Unfortunately in many systems of interest it is not possible to identify an optimal collective variable, and one must deal with the non-ideal situation of a system in which some slow modes are not accelerated. We propose a two-step approach in which, by taking into account the residual multiscale nature of the problem, one is able to significantly speed up convergence. To do so, we combine an exploratory metadynamics run with an optimization of the free energy difference between metastable states, based on the recently proposed variationally enhanced sampling method. This new method is well parallelizable and is especially suited for complex systems, because of its simplicity and clear underlying physical picture.
216 - Li Liu , Mengge He , Guanghui Xu 2021
Reading and writing research papers is one of the most privileged abilities that a qualified researcher should master. However, it is difficult for new researchers (eg{students}) to fully {grasp} this ability. It would be fascinating if we could train an intelligent agent to help people read and summarize papers, and perhaps even discover and exploit the potential knowledge clues to write novel papers. Although there have been existing works focusing on summarizing (emph{i.e.}, reading) the knowledge in a given text or generating (emph{i.e.}, writing) a text based on the given knowledge, the ability of simultaneously reading and writing is still under development. Typically, this requires an agent to fully understand the knowledge from the given text materials and generate correct and fluent novel paragraphs, which is very challenging in practice. In this paper, we propose a Deep ReAder-Writer (DRAW) network, which consists of a textit{Reader} that can extract knowledge graphs (KGs) from input paragraphs and discover potential knowledge, a graph-to-text textit{Writer} that generates a novel paragraph, and a textit{Reviewer} that reviews the generated paragraph from three different aspects. Extensive experiments show that our DRAW network outperforms considered baselines and several state-of-the-art methods on AGENDA and M-AGENDA datasets. Our code and supplementary are released at https://github.com/menggehe/DRAW.
comments
Fetching comments Fetching comments
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا