Research papers, master's theses, and doctoral theses about "biomedical"

Findings of the WMT 2021 Biomedical Translation Shared Task: Summaries of Animal Experiments as New Test Set

891 - Association for Computational Linguistics 2021 Article

In the sixth edition of the WMT Biomedical Task, we addressed a total of eight language pairs, namely English/German, English/French, English/Spanish, English/Portuguese, English/Chinese, English/Russian, English/Italian, and English/Basque. Further, our tests were composed of three types of textual test sets. New to this year, we released a test set of summaries of animal experiments, in addition to the test sets of scientific abstracts and terminologies. We received a total of 107 submissions from 15 teams from 6 countries.

WMT shared, biomedical translation shared, wmt biomedical task

The Fujitsu DMATH Submissions for WMT21 News Translation and Biomedical Translation Tasks

648 - Association for Computational Linguistics 2021 Article

This paper describes the Fujitsu DMATH systems used for WMT 2021 News Translation and Biomedical Translation tasks. We focused on low-resource pairs, using a simple system. We conducted experiments on English-Hausa, Xhosa-Zulu and English-Basque, and submitted the results for Xhosa→Zulu in the News Translation Task, and English→Basque in the Biomedical Translation Task, abstract and terminology translation subtasks. Our system combines BPE dropout, sub-subword features and back-translation with a Transformer (base) model, achieving good results on the evaluation sets.

biomedical translation tasks, fujitsu dmath submissions, biomedical translation

Biomedical Concept Normalization by Leveraging Hypernyms

605 - Association for Computational Linguistics 2021 Article

Biomedical Concept Normalization (BCN) is widely used in biomedical text processing as a fundamental module. Owing to numerous surface variants of biomedical concepts, BCN still remains challenging and unsolved. In this paper, we exploit biomedical concept hypernyms to facilitate BCN. We propose Biomedical Concept Normalizer with Hypernyms (BCNH), a novel framework that adopts list-wise training to make use of both hypernyms and synonyms, and also employs norm constraint on the representation of hypernym-hyponym entity pairs. The experimental results show that BCNH outperforms the previous state-of-the-art model on the NCBI dataset.

biomedical concept normalization, concept normalization, biomedical concept

Sent2Span: Span Detection for PICO Extraction in the Biomedical Text without Span Annotations

723 - Association for Computational Linguistics 2021 Article

The rapid growth in published clinical trials makes it difficult to maintain up-to-date systematic reviews, which require finding all relevant trials. This leads to policy and practice decisions based on out-of-date, incomplete, and biased subsets of available clinical evidence. Extracting and then normalising Population, Intervention, Comparator, and Outcome (PICO) information from clinical trial articles may be an effective way to automatically assign trials to systematic reviews and avoid searching and screening---the two most time-consuming systematic review processes. We propose and test a novel approach to PICO span detection. The major difference between our proposed method and previous approaches comes from detecting spans without needing annotated span data and using only crowdsourced sentence-level annotations. Experiments on two datasets show that PICO span detection results achieve much higher results for recall when compared to fully supervised methods with PICO sentence detection at least as good as human annotations. By removing the reliance on expert annotations for span detection, this work could be used in a human-machine pipeline for turning low-quality, crowdsourced, and sentence-level PICO annotations into structured information that can be used to quickly assign trials to relevant systematic reviews.

biomedical text, pico extraction, pico span detection

Mixture-of-Partitions: Infusing Large Biomedical Knowledge Graphs into BERT

721 - Association for Computational Linguistics 2021 Article

Infusing factual knowledge into pre-trained models is fundamental for many knowledge-intensive tasks. In this paper, we proposed Mixture-of-Partitions (MoP), an infusion approach that can handle a very large knowledge graph (KG) by partitioning it into smaller sub-graphs and infusing their specific knowledge into various BERT models using lightweight adapters. To leverage the overall factual knowledge for a target task, these sub-graph adapters are further fine-tuned along with the underlying BERT through a mixture layer. We evaluate our MoP with three biomedical BERTs (SciBERT, BioBERT, PubmedBERT) on six downstream tasks (inc. NLI, QA, Classification), and the results show that our MoP consistently enhances the underlying BERTs in task performance, and achieves new SOTA performances on five evaluated datasets.

large knowledge graph, biomedical knowledge graphs, infusing large biomedical

What Would it Take to get Biomedical QA Systems into Practice?

1151 - Association for Computational Linguistics 2021 Article

Medical question answering (QA) systems have the potential to answer clinicians' uncertainties about treatment and diagnosis on-demand, informed by the latest evidence. However, despite the significant progress in general QA made by the NLP community, medical QA systems are still not widely used in clinical environments. One likely reason for this is that clinicians may not readily trust QA system outputs, in part because transparency, trustworthiness, and provenance have not been key considerations in the design of such models. In this paper we discuss a set of criteria that, if met, we argue would likely increase the utility of biomedical QA systems, which may in turn lead to adoption of such systems in practice. We assess existing models, tasks, and datasets with respect to these criteria, highlighting shortcomings of previously proposed approaches and pointing toward what might be more usable QA systems.

MFAQ, medical question answering, biomedical qa systems

FJWU Participation for the WMT21 Biomedical Translation Task

793 - Association for Computational Linguistics 2021 Article

In this paper we present FJWU's system submitted to the biomedical shared task at WMT21. We prepared state-of-the-art multilingual neural machine translation systems for three languages (i.e. German, Spanish and French) with English as the target language. Our NMT systems, based on the Transformer architecture, were trained on a combination of in-domain and out-of-domain parallel corpora developed using Information Retrieval (IR) and domain adaptation techniques.

fjwu participation, biomedical translation task, biomedical shared task

BERT might be Overkill: A Tiny but Effective Biomedical Entity Linker based on Residual Convolutional Neural Networks

380 - Association for Computational Linguistics 2021 Article

Biomedical entity linking is the task of linking entity mentions in a biomedical document to referent entities in a knowledge base. Recently, many BERT-based models have been introduced for the task. While these models achieve competitive results on many datasets, they are computationally expensive and contain about 110M parameters. Little is known about the factors contributing to their impressive performance and whether the over-parameterization is needed. In this work, we shed some light on the inner workings of these large BERT-based models. Through a set of probing experiments, we have found that the entity linking performance only changes slightly when the input word order is shuffled or when the attention scope is limited to a fixed window size. From these observations, we propose an efficient convolutional neural network with residual connections for biomedical entity linking. Because of the sparse connectivity and weight sharing properties, our model has a small number of parameters and is highly efficient. On five public datasets, our model achieves comparable or even better linking accuracy than the state-of-the-art BERT-based models while having about 60 times fewer parameters.

tiny but effective, entity linker based, effective biomedical entity

Relation Extraction Using Multiple Pre-Training Models in Biomedical Domain

780 - Association for Computational Linguistics 2021 Article

The number of biomedical documents is increasing rapidly. Accordingly, the demand for extracting knowledge from large-scale biomedical texts is also increasing. BERT-based models are known for their high performance in various tasks. However, they are often computationally expensive, and a high-end GPU environment is not available in many situations. To attain both high accuracy and fast extraction speed, we propose combinations of simpler pre-trained models. Our method outperforms the latest state-of-the-art model and BERT-based models on the GAD corpus. In addition, our method shows approximately three times faster extraction speed than the BERT-based models on the ChemProt corpus and reduces the memory size to one sixth of the BERT ones.

multiple pre-training models, multiple pre-training, biomedical domain

Learning Entity-Likeness with Multiple Approximate Matches for Biomedical NER

667 - Association for Computational Linguistics 2021 Article

Biomedical Named Entities are complex, so approximate matching has been used to improve entity coverage. However, the usual approximate matching approach fetches only one matching result, which is often noisy. In this work, we propose a method for biomedical NER that fetches multiple approximate matches for a given phrase to leverage their variations to estimate entity-likeness. The model uses pooling to discard the unnecessary information from the noisy matching results, and learn the entity-likeness of the phrase with multiple approximate matches. Experimental results on three benchmark datasets from the biomedical domain, BC2GM, NCBI-disease, and BC4CHEMD, demonstrate the effectiveness. Our model improves the average by up to +0.21 points compared to a BioBERT-based NER.

multiple approximate matches, biomedical named entities, approximate matches