Advanced search powered by artificial intelligence

New community

Subscribe to the gold package and get unlimited access to Shamra Academy

StreamHover: Livestream Transcript Summarization and Annotation

Streamhover: ملخص ترتيب Livestream والشروح

356 0 0 0.0 ( 0 )

Download Cite

Added by Association for Computation Linguistics مقالة

Publication date 2021

fields Artificial Intelligence

and research's language is English

Created by Shamra Editor

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

With the explosive growth of livestream broadcasting, there is an urgent need for new summarization technology that enables us to create a preview of streamed content and tap into this wealth of knowledge. However, the problem is nontrivial due to the informal nature of spoken language. Further, there has been a shortage of annotated datasets that are necessary for transcript summarization. In this paper, we present StreamHover, a framework for annotating and summarizing livestream transcripts. With a total of over 500 hours of videos annotated with both extractive and abstractive summaries, our benchmark dataset is significantly larger than currently existing annotated corpora. We explore a neural extractive summarization model that leverages vector-quantized variational autoencoder to learn latent vector representations of spoken utterances and identify salient utterances from the transcripts to form summaries. We show that our model generalizes better and improves performance over strong baselines. The results of this study provide an avenue for future research to improve summarization solutions for efficient browsing of livestreams.

References used

https://aclanthology.org/

rate research

TWEETSUMM - A Dialog Summarization Dataset for Customer Service

920 - Association for Computation Linguistics 2021 مقالة

In a typical customer service chat scenario, customers contact a support center to ask for help or raise complaints, and human agents try to solve the issues. In most cases, at the end of the conversation, agents are asked to write a short summary em phasizing the problem and the proposed solution, usually for the benefit of other agents that may have to deal with the same customer or issue. The goal of the present article is advancing the automation of this task. We introduce the first large scale, high quality, customer care dialog summarization dataset with close to 6500 human annotated summaries. The data is based on real-world customer support dialogs and includes both extractive and abstractive summaries. We also introduce a new unsupervised, extractive summarization method specific to dialogs.

dialog summarization dataset typical customer service خدمة العملاء النموذجية صناعة حمض الفوسفور

Hie-BART: Document Summarization with Hierarchical BART

581 - Association for Computation Linguistics 2021 مقالة

This paper proposes a new abstractive document summarization model, hierarchical BART (Hie-BART), which captures hierarchical structures of a document (i.e., sentence-word structures) in the BART model. Although the existing BART model has achieved a state-of-the-art performance on document summarization tasks, the model does not have the interactions between sentence-level information and word-level information. In machine translation tasks, the performance of neural machine translation models has been improved by incorporating multi-granularity self-attention (MG-SA), which captures the relationships between words and phrases. Inspired by the previous work, the proposed Hie-BART model incorporates MG-SA into the encoder of the BART model for capturing sentence-word structures. Evaluations on the CNN/Daily Mail dataset show that the proposed Hie-BART model outperforms some strong baselines and improves the performance of a non-hierarchical BART model (+0.23 ROUGE-L).

hierarchical bart bart model هرمي بارت نموذج بارت صناعة حمض الفوسفور

Improving Abstractive Dialogue Summarization with Hierarchical Pretraining and Topic Segment

805 - Association for Computation Linguistics 2021 مقالة

With the increasing abundance of meeting transcripts, meeting summary has attracted more and more attention from researchers. The unsupervised pre-training method based on transformer structure combined with fine-tuning of downstream tasks has achiev ed great success in the field of text summarization. However, the semantic structure and style of meeting transcripts are quite different from that of articles. In this work, we propose a hierarchical transformer encoder-decoder network with multi-task pre-training. Specifically, we mask key sentences at the word-level encoder and generate them at the decoder. Besides, we randomly mask some of the role alignments in the input text and force the model to recover the original role tags to complete the alignments. In addition, we introduce a topic segmentation mechanism to further improve the quality of the generated summaries. The experimental results show that our model is superior to the previous methods in meeting summary datasets AMI and ICSI.

improving abstractive dialogue تحسين الحوار الجماعي صناعة حمض الفوسفور

EASE: Extractive-Abstractive Summarization End-to-End using the Information Bottleneck Principle

990 - Association for Computation Linguistics 2021 مقالة

Current abstractive summarization systems outperform their extractive counterparts, but their widespread adoption is inhibited by the inherent lack of interpretability. Extractive summarization systems, though interpretable, suffer from redundancy an d possible lack of coherence. To achieve the best of both worlds, we propose EASE, an extractive-abstractive framework that generates concise abstractive summaries that can be traced back to an extractive summary. Our framework can be applied to any evidence-based text generation problem and can accommodate various pretrained models in its simple architecture. We use the Information Bottleneck principle to jointly train the extraction and abstraction in an end-to-end fashion. Inspired by previous research that humans use a two-stage framework to summarize long documents (Jing and McKeown, 2000), our framework first extracts a pre-defined amount of evidence spans and then generates a summary using only the evidence. Using automatic and human evaluations, we show that the generated summaries are better than strong extractive and extractive-abstractive baselines.

information bottleneck principle information bottleneck bottleneck principle معلومات عنق الزجاجة اختناق المعلومات مبدأ الاختناق صناعة حمض الفوسفور المزيد..

What's in a Summary? Laying the Groundwork for Advances in Hospital-Course Summarization

649 - Association for Computation Linguistics 2021 مقالة

Summarization of clinical narratives is a long-standing research problem. Here, we introduce the task of hospital-course summarization. Given the documentation authored throughout a patient's hospitalization, generate a paragraph that tells the story of the patient admission. We construct an English, text-to-text dataset of 109,000 hospitalizations (2M source notes) and their corresponding summary proxy: the clinician-authored Brief Hospital Course'' paragraph written as part of a discharge note. Exploratory analyses reveal that the BHC paragraphs are highly abstractive with some long extracted fragments; are concise yet comprehensive; differ in style and content organization from the source notes; exhibit minimal lexical cohesion; and represent silver-standard references. Our analysis identifies multiple implications for modeling this complex, multi-document summarization task.

groundwork for advances hospital-course summarization الأساس للتقدم تلخيص المستشفى بالطبع صناعة حمض الفوسفور

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

StreamHover: Livestream Transcript Summarization and Annotation

Streamhover: ملخص ترتيب Livestream والشروح

Ask ChatGPT about the research

Read More

suggested questions