In recent years, pre-trained language models (PLMs) such as BERT have proven to be very effective in diverse NLP tasks such as Information Extraction, Sentiment Analysis, and Question Answering. Trained on massive general-domain text, these pre-trained language models capture rich syntactic, semantic, and discourse information in the text. However, due to the differences between general-domain and domain-specific text (e.g., Wikipedia versus clinical notes), these models may not be ideal for domain-specific tasks (e.g., extracting clinical relations). Furthermore, properly understanding clinical text may require additional medical knowledge. To address these issues, in this research we conduct a comprehensive examination of different techniques for adding medical knowledge to a pre-trained BERT model for clinical relation extraction. Our best model outperforms state-of-the-art systems on the benchmark i2b2/VA 2010 clinical relation extraction dataset.
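To make the task setup concrete, below is a minimal sketch of clinical relation extraction framed as sequence classification with a pre-trained BERT model, using the Hugging Face `transformers` library. This is not the authors' released code: the `bert-base-uncased` checkpoint is a stand-in (the paper examines knowledge-enhanced variants), the entity-marker tokens are a common relation-extraction heuristic assumed here for illustration, and the label list reflects the i2b2/VA 2010 relation types.

```python
# A minimal sketch (assumptions noted above), not the paper's actual system.
import torch
from transformers import BertTokenizerFast, BertForSequenceClassification

# i2b2/VA 2010 relation labels, e.g., TrIP = "treatment improves problem";
# "None" covers candidate pairs with no relation.
LABELS = ["TrIP", "TrWP", "TrCP", "TrAP", "TrNAP", "TeRP", "TeCP", "PIP", "None"]

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
# Wrap the two candidate entities with marker tokens so the model can
# locate them; these marker strings are an illustrative assumption.
tokenizer.add_special_tokens(
    {"additional_special_tokens": ["[E1]", "[/E1]", "[E2]", "[/E2]"]}
)

model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=len(LABELS)
)
model.resize_token_embeddings(len(tokenizer))  # account for the new markers

# One candidate treatment-problem pair from a (synthetic) clinical note.
sentence = "[E1] warfarin [/E1] was held due to [E2] elevated INR [/E2] ."
inputs = tokenizer(sentence, return_tensors="pt", truncation=True)

with torch.no_grad():
    logits = model(**inputs).logits
print(LABELS[logits.argmax(dim=-1).item()])  # meaningless until fine-tuned
```

In this framing, injecting medical knowledge amounts to changing what the encoder has seen before fine-tuning, for example by continuing pre-training on clinical corpora or fusing external medical resources, while the classification head and training loop above stay the same.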