
FanfictionNLP: A Text Processing Pipeline for Fanfiction


Added by Association for Computational Linguistics (article)

Publication date: 2021

Field: Artificial Intelligence

Research language: English

Created by Shamra Editor


Abstract

Fanfiction presents an opportunity as a data source for research in NLP, education, and social science. However, answering specific research questions with this data is difficult, since fanfiction contains more diverse writing styles than formal fiction. We present a text processing pipeline for fanfiction, with a focus on identifying text associated with characters. The pipeline includes modules for character identification and coreference, as well as the attribution of quotes and narration to those characters. Additionally, the pipeline contains a novel approach to character coreference that uses knowledge from quote attribution to resolve pronouns within quotes. For each module, we evaluate the effectiveness of various approaches on 10 annotated fanfiction stories. This pipeline outperforms tools developed for formal fiction on the tasks of character coreference and quote attribution.
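The idea of using quote attribution to resolve pronouns within quotes can be illustrated with a minimal sketch. This is not the authors' implementation; the abstract does not describe their method in detail. The sketch assumes a simple heuristic: once a quote's speaker is known, first-person pronouns inside the quote refer to the speaker, and second-person pronouns refer to the addressee when one is known. The names Quote, resolve_pronouns_in_quote, and the pronoun lists are hypothetical and introduced only for illustration.

```python
import re
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Quote:
    text: str
    speaker: str                      # assumed output of a quote-attribution step
    addressee: Optional[str] = None   # assumed to be known only in some cases


# Hypothetical, deliberately small pronoun lists for the illustration.
FIRST_PERSON = {"i", "me", "my", "mine", "myself"}
SECOND_PERSON = {"you", "your", "yours", "yourself"}


def resolve_pronouns_in_quote(quote: Quote) -> Dict[int, str]:
    """Map the character offset of each resolvable pronoun to a character name."""
    resolved: Dict[int, str] = {}
    for match in re.finditer(r"[A-Za-z']+", quote.text):
        token = match.group().lower()
        if token in FIRST_PERSON:
            # First-person pronouns inside a quote point at its speaker.
            resolved[match.start()] = quote.speaker
        elif token in SECOND_PERSON and quote.addressee is not None:
            # Second-person pronouns point at the addressee, if one is known.
            resolved[match.start()] = quote.addressee
    return resolved


quote = Quote(text="I told you my plan", speaker="Hermione", addressee="Harry")
links = resolve_pronouns_in_quote(quote)
print(links)
```

Under these assumptions, the pronouns "I" and "my" are linked to the speaker and "you" to the addressee. Contractions such as "I'm" are not handled by this sketch, and a real system would also need to handle third-person pronouns and uncertain attributions.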

References used

https://aclanthology.org/


EventPlus: A Temporal Event Understanding Pipeline

Association for Computational Linguistics, 2021 (article)

We present EventPlus, a temporal event understanding pipeline that integrates various state-of-the-art event understanding components including event trigger and type detection, event argument detection, event duration and temporal relation extraction. Event information, especially event temporal knowledge, is a type of common sense knowledge that helps people understand how stories evolve and provides predictive hints for future events. EventPlus as the first comprehensive temporal event understanding pipeline provides a convenient tool for users to quickly obtain annotations about events and their temporal information for any user-provided document. Furthermore, we show EventPlus can be easily adapted to other domains (e.g., biomedical domain). We make EventPlus publicly available to facilitate event-related information extraction and downstream applications.


Split-and-Rephrase in a Cross-Lingual Manner: A Complete Pipeline

Association for Computational Linguistics, 2021 (article)

Split-and-rephrase is a challenging task that promotes the transformation of a given complex input sentence into multiple shorter sentences retaining equivalent meaning. This rewriting approach conceptualizes that shorter sentences benefit human readers and improve NLP downstream tasks attending as a preprocessing step. This work presents a complete pipeline capable of performing the split-and-rephrase method in a cross-lingual manner. We trained sequence-to-sequence neural models as from English corpora and applied them to predict the transformations in English and Brazilian Portuguese sentences jointly with BERT's masked language modeling. Contrary to traditional approaches that seek training models with extensive vocabularies, we present a non-trivial way to construct symbolic ones generalized solely by grammatical classes (POS tags) and their respective recurrences, reducing the amount of necessary training data. This pipeline contribution showed competitive results encouraging the expansion of the method to languages other than English.


MIPE: A Metric Independent Pipeline for Effective Code-Mixed NLG Evaluation

Association for Computational Linguistics, 2021 (article)

Code-mixing is a phenomenon of mixing words and phrases from two or more languages in a single utterance of speech and text. Due to the high linguistic diversity, code-mixing presents several challenges in evaluating standard natural language generation (NLG) tasks. Various widely popular metrics perform poorly with the code-mixed NLG tasks. To address this challenge, we present a metric independent evaluation pipeline MIPE that significantly improves the correlation between evaluation metrics and human judgments on the generated code-mixed text. As a use case, we demonstrate the performance of MIPE on the machine-generated Hinglish (code-mixing of Hindi and English languages) sentences from the HinGE corpus. We can extend the proposed evaluation strategy to other code-mixed language pairs, NLG tasks, and evaluation metrics with minimal to no effort.


VUS at IWSLT 2021: A Finetuned Pipeline for Offline Speech Translation

Association for Computational Linguistics, 2021 (article)

In this technical report, we describe the fine-tuned ASR-MT pipeline used for the IWSLT shared task. We remove less useful speech samples by checking WER with an ASR model, and further train a wav2vec and Transformers-based ASR module based on the filtered data. In addition, we cleanse the errata that can interfere with the machine translation process and use it for Transformer-based MT module training. Finally, in the actual inference phase, we use a sentence boundary detection model trained with constrained data to properly merge fragment ASR outputs into full sentences. The merged sentences are post-processed using part of speech. The final result is yielded by the trained MT module. The performance using the dev set displays BLEU 20.37, and this model records the performance of BLEU 20.9 with the test set.
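The WER-based filtering step mentioned above can be sketched as follows. This is an illustrative sketch only, not the report's code: it computes word error rate as word-level edit distance divided by the reference length and keeps samples whose WER is at or below a threshold. The function names word_error_rate and filter_by_wer, and the threshold value 0.5, are assumptions; the abstract does not state which threshold or tooling was used.

```python
from typing import List, Tuple


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


def filter_by_wer(samples: List[Tuple[str, str]],
                  max_wer: float = 0.5) -> List[Tuple[str, str]]:
    """Keep (transcript, asr_output) pairs whose WER does not exceed max_wer."""
    return [(ref, hyp) for ref, hyp in samples
            if word_error_rate(ref, hyp) <= max_wer]


samples = [
    ("the cat sat", "the cat sat"),   # exact match
    ("the cat sat", "a dog ran"),     # every word differs
]
kept = filter_by_wer(samples)
print(kept)
```

Here a sample is treated as "less useful" when the ASR output diverges strongly from its transcript, which is one plausible reading of the filtering described in the abstract.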


Builder, we have done it: Evaluating & Extending Dialogue-AMR NLU Pipeline for Two Collaborative Domains

Association for Computational Linguistics, 2021 (article)

We adopt, evaluate, and improve upon a two-step natural language understanding (NLU) pipeline that incrementally tames the variation of unconstrained natural language input and maps to executable robot behaviors. The pipeline first leverages Abstract Meaning Representation (AMR) parsing to capture the propositional content of the utterance, and second converts this into "Dialogue-AMR," which augments standard AMR with information on tense, aspect, and speech acts. Several alternative approaches and training datasets are evaluated for both steps and corresponding components of the pipeline, some of which outperform the original. We extend the Dialogue-AMR annotation schema to cover a different collaborative instruction domain and evaluate on both domains. With very little training data, we achieve promising performance in the new domain, demonstrating the scalability of this approach.

