BERT might be Overkill: A Tiny but Effective Biomedical Entity Linker based on Residual Convolutional Neural Networks

Added by Association for Computational Linguistics (article)

Publication date 2021

Field: Artificial Intelligence

Research language: English

Created by Shamra Editor

Abstract

Biomedical entity linking is the task of linking entity mentions in a biomedical document to referent entities in a knowledge base. Recently, many BERT-based models have been introduced for the task. While these models achieve competitive results on many datasets, they are computationally expensive and contain about 110M parameters. Little is known about the factors contributing to their impressive performance and whether the over-parameterization is needed. In this work, we shed some light on the inner workings of these large BERT-based models. Through a set of probing experiments, we have found that the entity linking performance only changes slightly when the input word order is shuffled or when the attention scope is limited to a fixed window size. From these observations, we propose an efficient convolutional neural network with residual connections for biomedical entity linking. Because of the sparse connectivity and weight sharing properties, our model has a small number of parameters and is highly efficient. On five public datasets, our model achieves comparable or even better linking accuracy than the state-of-the-art BERT-based models while having about 60 times fewer parameters.
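The two properties the abstract credits for the small parameter count, sparse (local) connectivity and weight sharing, together with a residual connection, can be illustrated with a toy sketch. This is not the authors' architecture: it assumes one scalar feature per token, an odd-sized kernel, and hypothetical function names chosen for illustration. The parameter figure at the end is only rough arithmetic from the two approximate numbers the abstract gives.

```python
def conv1d_same(xs, kernel):
    """Slide one shared kernel over the sequence (zero padding, same length).

    Each output position sees only len(kernel) neighbours (sparse connectivity),
    and every position reuses the same weights (weight sharing).
    Assumes an odd kernel size so the window is centred.
    """
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(xs) + [0.0] * pad
    return [
        sum(w * padded[i + j] for j, w in enumerate(kernel))
        for i in range(len(xs))
    ]


def relu(xs):
    """Element-wise rectified linear unit."""
    return [max(0.0, x) for x in xs]


def residual_conv_block(xs, kernel):
    """Residual connection: add the transformed signal back onto the input."""
    return [x + h for x, h in zip(xs, relu(conv1d_same(xs, kernel)))]


# The kernel has 3 weights no matter how long the input sequence is.
kernel = [0.25, 0.5, 0.25]
short_out = residual_conv_block([1.0, 0.0, -1.0], kernel)
long_out = residual_conv_block([0.5] * 50, kernel)

# Rough estimate only: "about 110M" parameters divided by "about 60".
rough_param_budget = 110_000_000 // 60
```

The parameter count of the block depends on the kernel size, not on the sequence length, which is why a convolutional model of this kind can stay small.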

References used

https://aclanthology.org/

Self-Attention Graph Residual Convolutional Networks for Event Detection with dependency relations

659 - Association for Computational Linguistics 2021 (article)

The event detection (ED) task aims to classify events by identifying key event trigger words embedded in a piece of text. Previous research has proved the validity of fusing syntactic dependency relations into Graph Convolutional Networks (GCN). While existing GCN-based methods explore latent node-to-node dependency relations according to a stationary adjacency tensor, an attention-based dynamic tensor, which can pay more attention to key nodes such as the event trigger or its neighboring nodes, has not been developed. At the same time, because of the graph information vanishing caused by the symmetric adjacency tensor, existing GCN models cannot achieve higher overall performance. In this paper, we propose a novel model, Self-Attention Graph Residual Convolution Networks (SA-GRCN), to mine node-to-node latent dependency relations via a self-attention mechanism, and we introduce a Graph Residual Network (GResNet) to solve the graph information vanishing problem. Specifically, a self-attention module is constructed to generate an attention tensor representing the dependency attention scores of all words in the sentence. Furthermore, a graph residual term is added to the baseline SA-GCN to construct a GResNet. Considering the syntactic connections of the network input, we initialize the residual term as the raw adjacency tensor, without processing it through the self-attention module. We conduct experiments on the ACE2005 dataset, and the results show significant improvement over competitive baseline methods.

Selective Attention Based Graph Convolutional Networks for Aspect-Level Sentiment Classification

925 - Association for Computational Linguistics 2021 (article)

Recent work on aspect-level sentiment classification has employed Graph Convolutional Networks (GCN) over dependency trees to learn interactions between aspect terms and opinion words. In some cases, the corresponding opinion words for an aspect term cannot be reached within two hops on dependency trees, which requires more GCN layers to model. However, GCNs often achieve the best performance with two layers, and deeper GCNs do not bring any additional gain. Therefore, we design a novel selective attention based GCN model. On one hand, the proposed model enables the direct interaction between aspect terms and context words via the self-attention operation without the distance limitation on dependency trees. On the other hand, a top-k selection procedure is designed to locate opinion words by selecting k context words with the highest attention scores. We conduct experiments on several commonly used benchmark datasets and the results show that our proposed SA-GCN outperforms strong baseline models.
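The top-k selection step described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the raw scores are made-up numbers standing in for learned attention logits, and `top_k_context` is a hypothetical helper name.

```python
import math


def softmax(scores):
    """Turn raw scores into attention weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_context(words, scores, k):
    """Keep the k context words with the highest attention weights.

    The selected words are returned in their original sentence order.
    """
    weights = softmax(scores)
    ranked = sorted(range(len(words)), key=lambda i: weights[i], reverse=True)
    keep = sorted(ranked[:k])
    return [words[i] for i in keep]


# Made-up scores for the context of a hypothetical aspect term "service".
words = ["the", "food", "was", "great", "but", "slow"]
scores = [0.1, 0.5, 0.2, 2.0, 0.3, 1.5]
selected = top_k_context(words, scores, k=2)
```

Because the selection works on attention weights rather than on tree distance, a word can be picked however many hops away it sits in the dependency tree.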

Private Text Classification with Convolutional Neural Networks

1206 - Association for Computational Linguistics 2021 (article)

Text classifiers are regularly applied to personal texts, leaving users of these classifiers vulnerable to privacy breaches. We propose a solution for privacy-preserving text classification that is based on Convolutional Neural Networks (CNNs) and Secure Multiparty Computation (MPC). Our method enables the inference of a class label for a personal text in such a way that (1) the owner of the personal text does not have to disclose their text to anyone in an unencrypted manner, and (2) the owner of the text classifier does not have to reveal the trained model parameters to the text owner or to anyone else. To demonstrate the feasibility of our protocol for practical private text classification, we implemented it in the PyTorch-based MPC framework CrypTen, using a well-known additive secret sharing scheme in the honest-but-curious setting. We test the runtime of our privacy-preserving text classifier, which is fast enough to be used in practice.
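The idea behind additive secret sharing can be shown with a small standalone sketch. It does not use CrypTen or reproduce the paper's protocol; the modulus, the two-party default, and the function names are assumptions made for illustration.

```python
import secrets

MODULUS = 2**64  # assumed ring size for this sketch


def share(value, n_parties=2):
    """Split an integer into n random-looking shares that sum to it mod MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % MODULUS
    return shares + [last]


def reconstruct(shares):
    """Recover the secret; this requires every party's share."""
    return sum(shares) % MODULUS


def add_shared(a_shares, b_shares):
    """Each party adds its own two shares locally; no secret is revealed."""
    return [(a + b) % MODULUS for a, b in zip(a_shares, b_shares)]


x_shares = share(5)
y_shares = share(7)
total = reconstruct(add_shared(x_shares, y_shares))
```

Addition needs no communication between parties, which is part of what makes linear layers comparatively cheap in this setting; multiplication of two shared values needs extra protocol steps that are not shown here.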

Rule-based Morphological Inflection Improves Neural Terminology Translation

774 - Association for Computational Linguistics 2021 (article)

Current approaches to incorporating terminology constraints in machine translation (MT) typically assume that the constraint terms are provided in their correct morphological forms. This limits their application to real-world scenarios where constraint terms are provided as lemmas. In this paper, we introduce a modular framework for incorporating lemma constraints in neural MT (NMT) in which linguistic knowledge and diverse types of NMT models can be flexibly applied. It is based on a novel cross-lingual inflection module that inflects the target lemma constraints based on the source context. We explore linguistically motivated rule-based and data-driven neural-based inflection modules and design English-German health and English-Lithuanian news test suites to evaluate them in domain adaptation and low-resource MT settings. Results show that our rule-based inflection module helps NMT models incorporate lemma constraints more accurately than a neural module and outperforms the existing end-to-end approach with lower training costs.

A ResNet-50-Based Convolutional Neural Network Model for Language ID Identification from Speech Recordings

733 - Association for Computational Linguistics 2021 (article)

This paper describes the model built for the SIGTYP 2021 Shared Task aimed at identifying 18 typologically different languages from speech recordings. Mel-frequency cepstral coefficients derived from audio files are transformed into spectrograms, which are then fed into a ResNet-50-based CNN architecture. The final model achieved validation and test accuracies of 0.73 and 0.53, respectively.
