Understanding Clinical Trial Reports: Extracting Medical Entities and Their Relations

125 0 0.0 ( 0 )

Download Cite

Added by Benjamin Nye

Publication date 2020

fields Informatics Engineering

and research's language is English

Authors Benjamin E. Nye - Jay DeYoung - Eric Lehman

Computation and Language

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

The best evidence concerning comparative treatment effectiveness comes from clinical trials, the results of which are reported in unstructured articles. Medical experts must manually extract information from articles to inform decision-making, which is time-consuming and expensive. Here we consider the end-to-end task of both (a) extracting treatments and outcomes from full-text articles describing clinical trials (entity identification) and, (b) inferring the reported results for the former with respect to the latter (relation extraction). We introduce new data for this task, and evaluate models that have recently achieved state-of-the-art results on similar tasks in Natural Language Processing. We then propose a new method motivated by how trial results are typically presented that outperforms these purely data-driven baselines. Finally, we run a fielded evaluation of the model with a non-profit seeking to identify existing drugs that might be re-purposed for cancer, showing the potential utility of end-to-end evidence extraction systems.

rate research

RadGraph: Extracting Clinical Entities and Relations from Radiology Reports

110 - Saahil Jain , Ashwin Agrawal , Adriel Saporta 2021

Extracting structured clinical information from free-text radiology reports can enable the use of radiology report information for a variety of critical healthcare applications. In our work, we present RadGraph, a dataset of entities and relations in full-text chest X-ray radiology reports based on a novel information extraction schema we designed to structure radiology reports. We release a development dataset, which contains board-certified radiologist annotations for 500 radiology reports from the MIMIC-CXR dataset (14,579 entities and 10,889 relations), and a test dataset, which contains two independent sets of board-certified radiologist annotations for 100 radiology reports split equally across the MIMIC-CXR and CheXpert datasets. Using these datasets, we train and test a deep learning model, RadGraph Benchmark, that achieves a micro F1 of 0.82 and 0.73 on relation extraction on the MIMIC-CXR and CheXpert test sets respectively. Additionally, we release an inference dataset, which contains annotations automatically generated by RadGraph Benchmark across 220,763 MIMIC-CXR reports (around 6 million entities and 4 million relations) and 500 CheXpert reports (13,783 entities and 9,908 relations) with mappings to associated chest radiographs. Our freely available dataset can facilitate a wide range of research in medical natural language processing, as well as computer vision and multi-modal learning when linked to chest radiographs.

Computation and Language Artificial Intelligence Information Retrieval

Learning to Infer Entities, Properties and their Relations from Clinical Conversations

71 - Nan Du , Mingqiu Wang , Linh Tran 2019

Recently we proposed the Span Attribute Tagging (SAT) Model (Du et al., 2019) to infer clinical entities (e.g., symptoms) and their properties (e.g., duration). It tackles the challenge of large label space and limited training data using a hierarchical two-stage approach that identifies the span of interest in a tagging step and assigns labels to the span in a classification step. We extend the SAT model to jointly infer not only entities and their properties but also relations between them. Most relation extraction models restrict inferring relations between tokens within a few neighboring sentences, mainly to avoid high computational complexity. In contrast, our proposed Relation-SAT (R-SAT) model is computationally efficient and can infer relations over the entire conversation, spanning an average duration of 10 minutes. We evaluate our model on a corpus of clinical conversations. When the entities are given, the R-SAT outperforms baselines in identifying relations between symptoms and their properties by about 32% (0.82 vs 0.62 F-score) and by about 50% (0.60 vs 0.41 F-score) on medications and their properties. On the more difficult task of jointly inferring entities and relations, the R-SAT model achieves a performance of 0.34 and 0.45 for symptoms and medications respectively, which is significantly better than 0.18 and 0.35 for the baseline model. The contributions of different components of the model are quantified using ablation analysis.

Computation and Language

Extracting Symptoms and their Status from Clinical Conversations

68 - Nan Du , Kai Chen , Anjuli Kannan 2019

This paper describes novel models tailored for a new application, that of extracting the symptoms mentioned in clinical conversations along with their status. Lack of any publicly available corpus in this privacy-sensitive domain led us to develop our own corpus, consisting of about 3K conversations annotated by professional medical scribes. We propose two novel deep learning approaches to infer the symptom names and their status: (1) a new hierarchical span-attribute tagging (SAT) model, trained using curriculum learning, and (2) a variant of sequence-to-sequence model which decodes the symptoms and their status from a few speaker turns within a sliding window over the conversation. This task stems from a realistic application of assisting medical providers in capturing symptoms mentioned by patients from their clinical conversations. To reflect this application, we define multiple metrics. From inter-rater agreement, we find that the task is inherently difficult. We conduct comprehensive evaluations on several contrasting conditions and observe that the performance of the models range from an F-score of 0.5 to 0.8 depending on the condition. Our analysis not only reveals the inherent challenges of the task, but also provides useful directions to improve the models.

Computation and Language Machine Learning

WNUT-2020 Task 1 Overview: Extracting Entities and Relations from Wet Lab Protocols

144 - Jeniya Tabassum , Sydney Lee , Wei Xu 2020

This paper presents the results of the wet lab information extraction task at WNUT 2020. This task consisted of two sub tasks: (1) a Named Entity Recognition (NER) task with 13 participants and (2) a Relation Extraction (RE) task with 2 participants. We outline the task, data annotation process, corpus statistics, and provide a high-level overview of the participating systems for each sub task.

Computation and Language

Using public clinical trial reports to evaluate observational study methods

215 - Ethan Steinberg , Steve Yadlowsky , Yizhe Xu 2020

Observational studies are valuable for estimating the effects of various medical interventions, but are notoriously difficult to evaluate because the methods used in observational studies require many untestable assumptions. This lack of verifiability makes it difficult both to compare different observational study methods and to trust the results of any particular observational study. In this work, we propose TrialVerify, a new approach for evaluating observational study methods based on ground truth sourced from clinical trial reports. We process trial reports into a denoised collection of known causal relationships that can then be used to estimate the precision and recall of various observational study methods. We then use TrialVerify to evaluate multiple observational study methods in terms of their ability to identify the known causal relationships from a large national insurance claims dataset. We found that inverse propensity score weighting is an effective approach for accurately reproducing known causal relationships and outperforms other observational study methods. TrialVerify is made freely available for others to evaluate observational study methods.

Applications