No Arabic abstract
Dietary supplements are used by a large portion of the population, but information on their pharmacologic interactions is incomplete. To address this challenge, we present SUPP.AI, an application for browsing evidence of supplement-drug interactions (SDIs) extracted from the biomedical literature. We train a model to automatically extract supplement information and identify such interactions from the scientific literature. To address the lack of labeled data for SDI identification, we use labels of the closely related task of identifying drug-drug interactions (DDIs) for supervision. We fine-tune the contextualized word representations of the RoBERTa language model using labeled DDI data, and apply the fine-tuned model to identify supplement interactions. We extract 195k evidence sentences from 22M articles (P=0.82, R=0.58, F1=0.68) for 60k interactions. We create the SUPP.AI application for users to search evidence sentences extracted by our model. SUPP.AI is an attempt to close the information gap on dietary supplements by making up-to-date evidence on SDIs more discoverable for researchers, clinicians, and consumers.
Off-the-shelf biomedical embeddings obtained from the recently released various pre-trained language models (such as BERT, XLNET) have demonstrated state-of-the-art results (in terms of accuracy) for the various natural language understanding tasks (NLU) in the biomedical domain. Relation Classification (RC) falls into one of the most critical tasks. In this paper, we explore how to incorporate domain knowledge of the biomedical entities (such as drug, disease, genes), obtained from Knowledge Graph (KG) Embeddings, for predicting Drug-Drug Interaction from textual corpus. We propose a new method, BERTKG-DDI, to combine drug embeddings obtained from its interaction with other biomedical entities along with domain-specific BioBERT embedding-based RC architecture. Experiments conducted on the DDIExtraction 2013 corpus clearly indicate that this strategy improves other baselines architectures by 4.1% macro F1-score.
OBJECTIVE: Leverage existing biomedical NLP tools and DS domain terminology to produce a novel and comprehensive knowledge graph containing dietary supplement (DS) information for discovering interactions between DS and drugs, or Drug-Supplement Interactions (DSI). MATERIALS AND METHODS: We created SemRepDS (an extension of SemRep), capable of extracting semantic relations from abstracts by leveraging a DS-specific terminology (iDISK) containing 28,884 DS terms not found in the UMLS. PubMed abstracts were processed using SemRepDS to generate semantic relations, which were then filtered using a PubMedBERT-based model to remove incorrect relations before generating our knowledge graph (SuppKG). Two pathways are used to identify potential DS-Drug interactions which are then evaluated by medical professionals for mechanistic plausibility. RESULTS: Comparison analysis found that SemRepDS returned 206.9% more DS relations and 158.5% more DS entities than SemRep. The fine-tuned BERT model obtained an F1 score of 0.8605 and removed 43.86% of the relations, improving the precision of the relations by 26.4% compared to pre-filtering. SuppKG consists of 2,928 DS-specific nodes. Manual review of findings identified 44 (88%) proposed DS-Gene-Drug and 32 (64%) proposed DS-Gene1-Function-Gene2-Drug pathways to be mechanistically plausible. DISCUSSION: The additional relations extracted using SemRepDS generated SuppKG that was used to find plausible DSI not found in the current literature. By the nature of the SuppKG, these interactions are unlikely to have been found using SemRep without the expanded DS terminology. CONCLUSION: We successfully extend SemRep to include DS information and produce SuppKG which can be used to find potential DS-Drug interactions.
When patients need to take medicine, particularly taking more than one kind of drug simultaneously, they should be alarmed that there possibly exists drug-drug interaction. Interaction between drugs may have a negative impact on patients or even cause death. Generally, drugs that conflict with a specific drug (or label drug) are usually described in its drug label or package insert. Since more and more new drug products come into the market, it is difficult to collect such information by manual. We take part in the Drug-Drug Interaction (DDI) Extraction from Drug Labels challenge of Text Analysis Conference (TAC) 2018, choosing task1 and task2 to automatically extract DDI related mentions and DDI relations respectively. Instead of regarding task1 as named entity recognition (NER) task and regarding task2 as relation extraction (RE) task then solving it in a pipeline, we propose a two step joint model to detect DDI and its related mentions jointly. A sequence tagging system (CNN-GRU encoder-decoder) finds precipitants first and search its fine-grained Trigger and determine the DDI for each precipitant in the second step. Moreover, a rule based model is built to determine the sub-type for pharmacokinetic interation. Our system achieved best result in both task1 and task2. F-measure reaches 0.46 in task1 and 0.40 in task2.
Preventable adverse drug reactions as a result of medical errors present a growing concern in modern medicine. As drug-drug interactions (DDIs) may cause adverse reactions, being able to extracting DDIs from drug labels into machine-readable form is an important effort in effectively deploying drug safety information. The DDI track of TAC 2018 introduces two large hand-annotated test sets for the task of extracting DDIs from structured product labels with linkage to standard terminologies. Herein, we describe our approach to tackling tasks one and two of the DDI track, which corresponds to named entity recognition (NER) and sentence-level relation extraction respectively. Namely, our approach resembles a multi-task learning framework designed to jointly model various sub-tasks including NER and interaction type and outcome prediction. On NER, our system ranked second (among eight teams) at 33.00% and 38.25% F1 on Test Sets 1 and 2 respectively. On relation extraction, our system ranked second (among four teams) at 21.59% and 23.55% on Test Sets 1 and 2 respectively.
Preventable adverse events as a result of medical errors present a growing concern in the healthcare system. As drug-drug interactions (DDIs) may lead to preventable adverse events, being able to extract DDIs from drug labels into a machine-processable form is an important step toward effective dissemination of drug safety information. In this study, we tackle the problem of jointly extracting drugs and their interactions, including interaction outcome, from drug labels. Our deep learning approach entails composing various intermediate representations including sequence and graph based context, where the latter is derived using graph convolutions (GC) with a novel attention-based gating mechanism (holistically called GCA). These representations are then composed in meaningful ways to handle all subtasks jointly. To overcome scarcity in training data, we additionally propose transfer learning by pre-training on related DDI data. Our model is trained and evaluated on the 2018 TAC DDI corpus. Our GCA model in conjunction with transfer learning performs at 39.20% F1 and 26.09% F1 on entity recognition (ER) and relation extraction (RE) respectively on the first official test set and at 45.30% F1 and 27.87% F1 on ER and RE respectively on the second official test set corresponding to an improvement over our prior best results by up to 6 absolute F1 points. After controlling for available training data, our model exhibits state-of-the-art performance by improving over the next comparable best outcome by roughly three F1 points in ER and 1.5 F1 points in RE evaluation across two official test sets.