ﻻ يوجد ملخص باللغة العربية
Transcripts generated by automatic speech recognition (ASR) systems for spoken documents lack structural annotations such as paragraphs, significantly reducing their readability. Automatically predicting paragraph segmentation for spoken documents may both improve readability and downstream NLP performance such as summarization and machine reading comprehension. We propose a sequence model with self-adaptive sliding window for accurate and efficient paragraph segmentation. We also propose an approach to exploit phonetic information, which significantly improves robustness of spoken document segmentation to ASR errors. Evaluations are conducted on the English Wiki-727K document segmentation benchmark, a Chinese Wikipedia-based document segmentation dataset we created, and an in-house Chinese spoken document dataset. Our proposed model outperforms the state-of-the-art (SOTA) model based on the same BERT-Base, increasing segmentation F1 on the English benchmark by 4.2 points and on Chinese datasets by 4.3-10.1 points, while reducing inference time to less than 1/6 of inference time of the current SOTA.
Recently abstractive spoken language summarization raises emerging research interest, and neural sequence-to-sequence approaches have brought significant performance improvement. However, summarizing long meeting transcripts remains challenging. Due
Sequence labeling is an important technique employed for many Natural Language Processing (NLP) tasks, such as Named Entity Recognition (NER), slot tagging for dialog systems and semantic parsing. Large-scale pre-trained language models obtain very g
This work proposes a novel adaptation of a pretrained sequence-to-sequence model to the task of document ranking. Our approach is fundamentally different from a commonly-adopted classification-based formulation of ranking, based on encoder-only pretr
The quadratic computational and memory complexities of large Transformers have limited their scalability for long document summarization. In this paper, we propose Hepos, a novel efficient encoder-decoder attention with head-wise positional strides t
The scheme of the sliding window is known in Information Theory, Computer Science, the problem of predicting and in stastistics. Let a source with unknown statistics generate some word $... x_{-1}x_{0}x_{1}x_{2}...$ in some alphabet $A$. For every mo