From Alignment to Assignment: Frustratingly Simple Unsupervised Entity Alignment

Added by Association for Computational Linguistics (article)

Publication date 2021

Field: Artificial Intelligence

Research language: English

Cross-lingual entity alignment (EA) aims to find equivalent entities across cross-lingual knowledge graphs (KGs), a crucial step in integrating KGs. Recently, many GNN-based EA methods have been proposed and have shown decent performance improvements on several public datasets. However, existing GNN-based EA methods inevitably inherit poor interpretability and low efficiency from neural networks. Motivated by the isomorphic assumption of GNN-based methods, we transform the cross-lingual EA problem into an assignment problem. Based on this redefinition, we propose a frustratingly Simple but Effective Unsupervised entity alignment method (SEU) that uses no neural networks. Extensive experiments show that our unsupervised approach even beats advanced supervised methods across all public datasets while offering high efficiency, interpretability, and stability.
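To make the redefinition concrete, the following is a minimal sketch of entity alignment posed as an assignment problem. It is not the authors' SEU implementation: the similarity matrix is hypothetical, the function name `best_assignment` is made up for illustration, and the brute-force search over permutations is only practical for toy sizes (a real system would need a polynomial-time assignment solver). Each row stands for an entity in one KG, each column for an entity in the other, and the goal is the one-to-one matching with the highest total similarity.

```python
from itertools import permutations


def best_assignment(similarity):
    """Return the one-to-one row-to-column matching with maximal total similarity.

    Brute force over all permutations: exact, but only feasible for tiny inputs.
    """
    n = len(similarity)
    best_score, best_perm = float("-inf"), None
    for perm in permutations(range(n)):
        score = sum(similarity[i][perm[i]] for i in range(n))
        if score > best_score:
            best_score, best_perm = score, perm
    return list(best_perm), best_score


# Hypothetical similarity scores between 3 entities of KG A (rows)
# and 3 entities of KG B (columns). Values are illustrative only.
similarity = [
    [0.90, 0.80, 0.10],
    [0.85, 0.20, 0.10],
    [0.10, 0.30, 0.70],
]

# Assignment view: every entity is matched exactly once.
assignment, total = best_assignment(similarity)

# Greedy nearest-neighbour view: each row picks its best column independently,
# so two rows may claim the same column.
greedy = [max(range(len(row)), key=lambda j: row[j]) for row in similarity]

print("assignment:", assignment, "total similarity:", round(total, 2))
print("greedy:", greedy)
```

In this toy case the greedy choice maps the first two entities to the same target, while the assignment formulation resolves the conflict by giving the first entity its second-best match, which raises the total similarity of the matching as a whole.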

References used

https://aclanthology.org/

Simple Entity-Centric Questions Challenge Dense Retrievers

Association for Computational Linguistics, 2021, article

Open-domain question answering has exploded in popularity recently due to the success of dense retrieval models, which have surpassed sparse models using only a few supervised training examples. However, in this paper, we demonstrate current dense models are not yet the holy grail of retrieval. We first construct EntityQuestions, a set of simple, entity-rich questions based on facts from Wikidata (e.g., "Where was Arve Furset born?"), and observe that dense retrievers drastically under-perform sparse methods. We investigate this issue and uncover that dense retrievers can only generalize to common entities unless the question pattern is explicitly observed during training. We discuss two simple solutions towards addressing this critical problem. First, we demonstrate that data augmentation is unable to fix the generalization problem. Second, we argue a more robust passage encoder helps facilitate better question adaptation using specialized question encoders. We hope our work can shed light on the challenges in creating a robust, universal dense retriever that works well across different input distributions.

Self-Alignment Pretraining for Biomedical Entity Representations

Association for Computational Linguistics, 2021, article

Despite the widespread success of self-supervised learning via masked language models (MLM), accurately capturing fine-grained semantic relationships in the biomedical domain remains a challenge. This is of paramount importance for entity-level tasks such as entity linking where the ability to model entity relations (especially synonymy) is pivotal. To address this challenge, we propose SapBERT, a pretraining scheme that self-aligns the representation space of biomedical entities. We design a scalable metric learning framework that can leverage UMLS, a massive collection of biomedical ontologies with 4M+ concepts. In contrast with previous pipeline-based hybrid systems, SapBERT offers an elegant one-model-for-all solution to the problem of medical entity linking (MEL), achieving a new state-of-the-art (SOTA) on six MEL benchmarking datasets. In the scientific domain, we achieve SOTA even without task-specific supervision. With substantial improvement over various domain-specific pretrained MLMs such as BioBERT, SciBERT and PubMedBERT, our pretraining scheme proves to be both effective and robust.

Unsupervised Domain Adaptation Method with Semantic-Structural Alignment for Dependency Parsing

Association for Computational Linguistics, 2021, article

Unsupervised cross-domain dependency parsing aims to accomplish domain adaptation for dependency parsing without using labeled data in the target domain. Existing methods are often of the pseudo-annotation type, which generates data through self-annotation by the base model and performs iterative training. However, these methods fail to consider changing the model structure for domain adaptation. In addition, the structural information contained in the text cannot be fully exploited. To remedy these drawbacks, we propose a Semantics-Structure Adaptative Dependency Parser (SSADP), which accomplishes unsupervised cross-domain dependency parsing without relying on pseudo-annotation or data selection. In particular, we design two feature extractors to extract semantic and structural features respectively. For each type of feature, a corresponding feature adaptation method is used to align the domain distributions, which effectively enhances the unsupervised cross-domain transfer capability of the model. We validate the effectiveness of our model by conducting experiments on CODT1 and CTB9, and the results demonstrate that our model achieves consistent performance improvements. In addition, we verify the structure transfer ability of the proposed model by introducing the Weisfeiler-Lehman test.

Frustratingly Simple but Surprisingly Strong: Using Language-Independent Features for Zero-shot Cross-lingual Semantic Parsing

Association for Computational Linguistics, 2021, article

The availability of corpora has led to significant advances in training semantic parsers in English. Unfortunately, for languages other than English, annotated data is limited and so is the performance of the developed parsers. Recently, pretrained multilingual models have been proven useful for zero-shot cross-lingual transfer in many NLP tasks. What else does it require to apply a parser trained in English to other languages for zero-shot cross-lingual semantic parsing? Will simple language-independent features help? To this end, we experiment with six Discourse Representation Structure (DRS) semantic parsers in English, and generalize them to Italian, German and Dutch, where there are only a small number of manually annotated parses available. Extensive experiments show that despite its simplicity, adding Universal Dependency (UD) relations and Universal POS tags (UPOS) as model-agnostic features achieves surprisingly strong improvement on all parsers.

AligNART: Non-autoregressive Neural Machine Translation by Jointly Learning to Estimate Alignment and Translate

Association for Computational Linguistics, 2021, article

Non-autoregressive neural machine translation (NART) models suffer from the multi-modality problem which causes translation inconsistency such as token repetition. Most recent approaches have attempted to solve this problem by implicitly modeling dependencies between outputs. In this paper, we introduce AligNART, which leverages full alignment information to explicitly reduce the modality of the target distribution. AligNART divides the machine translation task into (i) alignment estimation and (ii) translation with aligned decoder inputs, guiding the decoder to focus on simplified one-to-one translation. To alleviate the alignment estimation problem, we further propose a novel alignment decomposition method. Our experiments show that AligNART outperforms previous non-iterative NART models that focus on explicit modality reduction on WMT14 En↔De and WMT16 Ro→En. Furthermore, AligNART achieves BLEU scores comparable to those of the state-of-the-art connectionist temporal classification based models on WMT14 En↔De. We also observe that AligNART effectively addresses the token repetition problem even without sequence-level knowledge distillation.