
SpanPredict: Extraction of Predictive Document Spans with Neural Attention



Added by Association for Computational Linguistics (article)

Publication date 2021

Field: Artificial Intelligence

Research language: English

Created by Shamra Editor

Keywords: neural attention, identifying predictive neural


In many natural language processing applications, identifying predictive text can be as important as the predictions themselves. When predicting medical diagnoses, for example, identifying predictive content in clinical notes not only enhances interpretability, but also allows unknown, descriptive (i.e., text-based) risk factors to be identified. We here formalize this problem as predictive extraction and address it using a simple mechanism based on linear attention. Our method preserves differentiability, allowing scalable inference via stochastic gradient descent. Further, the model decomposes predictions into a sum of contributions of distinct text spans. Importantly, we require only document labels, not ground-truth spans. Results show that our model identifies semantically-cohesive spans and assigns them scores that agree with human ratings, while preserving classification performance.
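The decomposition described in the abstract, where a document-level prediction is a sum of contributions from distinct text spans, can be illustrated with a minimal sketch. This is not the authors' implementation: the per-token scores, the hard 0/1 span masks, and the names `span_mask`, `span_contributions`, and `predict` are assumptions made for illustration. The paper's differentiable attention mechanism is replaced here by fixed span boundaries.

```python
import math

def span_mask(length, start, end):
    """Hard 0/1 mask over token positions [start, end).
    Assumption: a trained model would use a soft, differentiable mask."""
    return [1.0 if start <= t < end else 0.0 for t in range(length)]

def span_contributions(token_scores, spans):
    """Sum the per-token scores that fall inside each span."""
    n = len(token_scores)
    return [
        sum(m * s for m, s in zip(span_mask(n, start, end), token_scores))
        for start, end in spans
    ]

def predict(token_scores, spans, bias=0.0):
    """Return (probability, per-span contributions).
    The logit is the bias plus the sum of the span contributions."""
    contribs = span_contributions(token_scores, spans)
    logit = bias + sum(contribs)
    prob = 1.0 / (1.0 + math.exp(-logit))
    return prob, contribs

# Hypothetical per-token scores for a six-token document.
token_scores = [0.0, 2.0, 1.0, 0.0, -1.5, -0.5]
spans = [(1, 3), (4, 6)]  # two non-overlapping spans (illustrative)
prob, contribs = predict(token_scores, spans)
print(contribs)  # one additive contribution per span
print(prob)
```

Because the logit is a plain sum, each span's contribution can be read off directly as its share of the prediction, which is what makes this kind of model interpretable.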

References used

https://aclanthology.org/


Extracting Appointment Spans from Medical Conversations

258 - Association for Computational Linguistics 2021 (article)

Extracting structured information from medical conversations can reduce the documentation burden for doctors and help patients follow through with their care plan. In this paper, we introduce a novel task of extracting appointment spans from medical conversations. We frame this task as a sequence tagging problem and focus on extracting spans for appointment reason and time. However, annotating medical conversations is expensive, time-consuming, and requires considerable domain expertise. Hence, we propose to leverage weak supervision approaches, namely incomplete supervision, inaccurate supervision, and a hybrid supervision approach, and evaluate both generic and domain-specific ELMo and BERT embeddings using sequence tagging models. The best performing model is the domain-specific BERT variant using weak hybrid supervision, which obtains an F1 score of 79.32.
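Framing span extraction as sequence tagging means each token receives a tag, and spans are recovered from runs of tags. The sketch below decodes spans from the common BIO tagging scheme. The tag names `TIME` and `REASON`, the function `decode_bio`, and the example sentence are assumptions for illustration; the abstract does not state which tagging scheme the paper uses.

```python
def decode_bio(tokens, tags):
    """Collect (label, text) spans from BIO tags.
    An I- tag that does not continue an open span of the same
    label is treated as outside any span."""
    spans = []
    current_label, current_tokens = None, []

    def flush():
        # Close the open span, if any, and record it.
        if current_label is not None:
            spans.append((current_label, " ".join(current_tokens)))

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            flush()
            current_label, current_tokens = tag[2:], [token]
        elif tag.startswith("I-") and current_label == tag[2:]:
            current_tokens.append(token)
        else:
            flush()
            current_label, current_tokens = None, []
    flush()
    return spans

# Hypothetical tagged utterance.
tokens = "Come back in two weeks for a blood test".split()
tags = ["O", "O", "B-TIME", "I-TIME", "I-TIME",
        "O", "B-REASON", "I-REASON", "I-REASON"]
print(decode_bio(tokens, tags))
```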

Keywords: medical conversations, extracting appointment spans, annotating medical conversations

Contrastive Document Representation Learning with Graph Attention Networks

260 - Association for Computational Linguistics 2021 (article)

Recent progress in pretrained Transformer-based language models has shown great success in learning contextual representations of text. However, due to the quadratic complexity of self-attention, most pretrained Transformer models can only handle relatively short text, and modeling very long documents remains a challenge. In this work, we propose to use a graph attention network on top of an available pretrained Transformer model to learn document embeddings. This graph attention network allows us to leverage the high-level semantic structure of the document. In addition, based on our graph document model, we design a simple contrastive learning strategy to pretrain our models on a large unlabeled corpus. Empirically, we demonstrate the effectiveness of our approaches in document classification and document retrieval tasks.

Keywords: learning contextual representation, document representation learning

Neural Attention-Aware Hierarchical Topic Model

280 - Association for Computational Linguistics 2021 (article)

Neural topic models (NTMs) apply deep neural networks to topic modelling. Despite their success, NTMs generally ignore two important aspects: (1) only document-level word count information is utilized for training, while more fine-grained sentence-level information is ignored, and (2) external semantic knowledge regarding documents, sentences and words is not exploited for training. To address these issues, we propose a variational autoencoder (VAE) NTM that jointly reconstructs the sentence and document word counts using combinations of bag-of-words (BoW) topical embeddings and pre-trained semantic embeddings. The pre-trained embeddings are first transformed into a common latent topical space to align their semantics with the BoW embeddings. Our model also features hierarchical KL divergence to leverage the embeddings of each document to regularize those of its sentences, paying more attention to semantically relevant sentences. Both quantitative and qualitative experiments have shown the efficacy of our model in (1) lowering the reconstruction errors at both the sentence and document levels, and (2) discovering more coherent topics from real-world datasets.

Keywords: attention-aware hierarchical topic, neural attention-aware hierarchical

A Review on Document Information Extraction Approaches

818 - Association for Computational Linguistics 2021 (article)

Information extraction from documents has become a major application area for novel natural language processing methods. Most entity extraction methodologies vary with context, such as the medical or financial domain, and are often limited to a given language. It would be better to have one generic approach, applicable to any document type, that extracts entity information regardless of language, context, and structure. Another issue in such research is structural analysis that preserves hierarchical, semantic, and heuristic features. A further problem is that these methods usually require a massive training corpus. This research therefore focuses on mitigating such barriers. Several approaches have been identified for building document information extractors focusing on different disciplines. This research area involves natural language processing, semantic analysis, information extraction, and conceptual modelling. This paper presents a review of information extraction mechanisms to construct a generic framework for document extraction, with the aim of providing a solid base for upcoming research.

Keywords: media texts, information extraction

Extending Neural Keyword Extraction with TF-IDF tagset matching

307 - Association for Computational Linguistics 2021 (article)

Keyword extraction is the task of identifying words (or multi-word expressions) that best describe a given document and serve in news portals to link articles of similar topics. In this work, we develop and evaluate our methods on four novel datasets covering less-represented, morphologically rich languages in the European news media industry (Croatian, Estonian, Latvian, and Russian). First, we evaluate two supervised neural transformer-based methods, Transformer-based Neural Tagger for Keyword Identification (TNT-KID) and Bidirectional Encoder Representations from Transformers (BERT) with an additional Bidirectional Long Short-Term Memory Conditional Random Fields (BiLSTM CRF) classification head, and compare them to a baseline Term Frequency - Inverse Document Frequency (TF-IDF) based unsupervised approach. Next, we show that by combining the keywords retrieved by both neural transformer-based methods and extending the final set of keywords with an unsupervised TF-IDF based technique, we can drastically improve the recall of the system, making it appropriate for use as a recommendation system in the media house environment.
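The combination step described above can be sketched as follows: merge the keywords from two neural taggers, then fill the remaining slots with top-ranked TF-IDF terms. This is a simplified illustration, not the paper's pipeline. The function names, the `target_size` parameter, the plain `tf * log(N / df)` weighting, and the toy corpus are all assumptions.

```python
import math
from collections import Counter

def tfidf_keywords(doc_tokens, corpus, top_k=3):
    """Rank the terms of one document by tf * log(N / df).
    Terms that never occur in the corpus are skipped."""
    n_docs = len(corpus)
    df = Counter(term for doc in corpus for term in set(doc))
    tf = Counter(doc_tokens)
    scores = {
        term: (count / len(doc_tokens)) * math.log(n_docs / df[term])
        for term, count in tf.items()
        if df[term] > 0
    }
    # Sort by descending score, breaking ties alphabetically.
    ranked = sorted(scores, key=lambda term: (-scores[term], term))
    return ranked[:top_k]

def extend_keywords(neural_a, neural_b, tfidf_ranked, target_size):
    """Union the two neural keyword lists (order-preserving), then
    append TF-IDF terms until target_size keywords are collected."""
    merged = list(dict.fromkeys(neural_a + neural_b))
    for term in tfidf_ranked:
        if len(merged) >= target_size:
            break
        if term not in merged:
            merged.append(term)
    return merged

# Toy corpus of tokenized documents (illustrative only).
corpus = [["tax", "vote", "budget"], ["vote", "football"], ["vote", "budget", "budget"]]
ranked = tfidf_keywords(corpus[2], corpus, top_k=2)
print(extend_keywords(["election"], ["parliament", "election"], ranked, target_size=4))
```

Extending a precise but short neural keyword list with unsupervised candidates trades some precision for recall, which matches the recommendation use case the abstract mentions.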

Keywords: tf-idf tagset matching, tagset matching, neural keyword extraction
