Paraphrase identification (PI), a fundamental task in natural language processing, is to identify whether two sentences express the same or similar meaning; it is a binary classification problem. Recently, BERT-like pre-trained language models have been a popular choice as the backbone of PI models, but almost all existing methods consider general-domain text. When these approaches are applied to a specific domain, they cannot make accurate predictions because they lack professional knowledge. In light of this challenge, we propose a novel framework that leverages external unstructured Wikipedia knowledge to accurately identify paraphrases. We propose to mine outline knowledge of concepts related to the given sentences from Wikipedia via the BM25 model. After retrieving the related outline knowledge, the framework makes predictions based on both the semantic information of the two sentences and the outline knowledge. In addition, we propose a gating mechanism to aggregate the semantic information-based prediction and the knowledge-based prediction. Extensive experiments are conducted on two public datasets: PARADE (a computer science domain dataset) and clinicalSTS2019 (a biomedical domain dataset). The results show that the proposed framework outperforms state-of-the-art methods.
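The BM25 retrieval step mentioned in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `bm25_scores`, the whitespace tokenizer, the parameter values `k1=1.5` and `b=0.75`, and the toy outline strings are all assumptions chosen for illustration. The scoring uses the common Okapi BM25 formula with a smoothed IDF.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against the query with Okapi BM25.

    Hypothetical helper: tokenization is plain lowercase whitespace
    splitting, and k1/b are common default values, not values from the paper.
    """
    tokenized = [d.lower().split() for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(d) for d in tokenized) / n_docs

    # Document frequency: number of documents containing each term.
    df = Counter()
    for d in tokenized:
        df.update(set(d))

    scores = []
    for d in tokenized:
        tf = Counter(d)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            # Smoothed IDF, always positive.
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # Term-frequency saturation with document-length normalization.
            denom = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / denom
        scores.append(score)
    return scores


# Toy stand-ins for Wikipedia outline entries (invented for this example).
outlines = [
    "binary search tree insertion deletion traversal",
    "hash table collision chaining open addressing",
    "neural network backpropagation gradient descent",
]
scores = bm25_scores("hash table collision", outlines)
best = max(range(len(outlines)), key=scores.__getitem__)
print(best, scores)
```

In a pipeline like the one described, the top-scoring outlines would then be passed to the classifier together with the sentence pair; how the framework encodes them is not specified in the abstract.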
Precisely defining terminology is the first step in scientific communication. Developing neural text generation models for definition generation can circumvent labor-intensive curation, further accelerating scientific discovery. Unfortunately, the lack of a large-scale terminology definition dataset hinders progress toward definition generation. In this paper, we present Graphine, a large-scale terminology definition dataset covering 2,010,648 terminology definition pairs and spanning 227 biomedical subdisciplines. The terminologies in each subdiscipline further form a directed acyclic graph, opening up new avenues for developing graph-aware text generation models. We then propose Graphex, a novel graph-aware definition generation model that integrates a transformer with a graph neural network. Our model outperforms existing text generation models by exploiting the graph structure of terminologies. We further demonstrate how Graphine can be used to evaluate pretrained language models, compare graph representation learning methods, and predict sentence granularity. We envision Graphine as a unique resource for definition generation and many other NLP tasks in biomedicine.
Definition modelling is the task of automatically generating a dictionary-style definition given a target word. In this paper, we consider cross-lingual definition generation. Specifically, we generate English definitions for Wolastoqey (Malecite-Passamaquoddy) words. Wolastoqey is an endangered, low-resource polysynthetic language. We hypothesize that sub-word representations based on byte pair encoding (Sennrich et al., 2016) can be leveraged to represent morphologically complex Wolastoqey words and overcome the challenge of not having large corpora available for training. Our experimental results demonstrate that this approach outperforms baseline methods in terms of BLEU score.
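As an illustration of the byte pair encoding idea referenced in the abstract above, the following is a minimal sketch of learning and applying BPE merges. The function names `learn_bpe` and `segment`, the `</w>` end-of-word marker, and the English toy words are assumptions for the example; the paper's actual tokenizer settings and Wolastoqey data are not reproduced here.

```python
from collections import Counter


def learn_bpe(words, num_merges):
    """Learn an ordered list of BPE merges from a list of word tokens."""
    # Each word starts as a sequence of characters plus an end-of-word marker.
    vocab = Counter(tuple(w) + ("</w>",) for w in words)
    merges = []
    for _ in range(num_merges):
        # Count how often each adjacent symbol pair occurs, weighted by word frequency.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        # Rewrite every word with the most frequent pair fused into one symbol.
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            new_vocab[tuple(_merge(symbols, best))] += freq
        vocab = new_vocab
    return merges


def _merge(symbols, pair):
    """Fuse every adjacent occurrence of `pair` in a symbol sequence."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return out


def segment(word, merges):
    """Split an unseen word into sub-word units by replaying the merges in order."""
    symbols = list(word) + ["</w>"]
    for pair in merges:
        symbols = _merge(symbols, pair)
    return symbols


merges = learn_bpe(["low", "low", "lower", "lowest"], num_merges=2)
pieces = segment("lowly", merges)
print(merges, pieces)
```

The point of the technique for a polysynthetic language is visible even in this toy case: a word never seen in training is still decomposed into known sub-word units rather than mapped to a single unknown token.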
This paper describes a freely available web-based demonstrator called HB Deid. HB Deid identifies protected health information (PHI) in text written in Swedish and removes it, masks it, or replaces it with surrogates or pseudonyms. PHI consists of named entities such as personal names, locations, ages, phone numbers, and dates. To find PHI, HB Deid uses a CRF model trained on non-sensitive annotated Swedish text, followed by a rule-based post-processing step. The final step in obscuring the PHI is to either mask it, show only its class name, or use a rule-based pseudonymisation system to replace it.
This paper describes the results of the shared tasks organized as part of the VarDial Evaluation Campaign 2021. The campaign was part of the eighth workshop on Natural Language Processing (NLP) for Similar Languages, Varieties and Dialects (VarDial), co-located with EACL 2021. Four separate shared tasks were included this year: Dravidian Language Identification (DLI), Romanian Dialect Identification (RDI), Social Media Variety Geolocation (SMG), and Uralic Language Identification (ULI). DLI was organized for the first time and the other three continued a series of tasks from previous evaluation campaigns.
Civil law is the body of rules that govern relations between persons, except for matters regulated by another branch of private law. Civil law is the foundation of private law; the other branches, such as commercial law and labour law, branched off from it because the evolving needs of people in society required special provisions for a particular profession or activity, separate from civil law. Civil law regulates two kinds of ties: personal status on the one hand, and real status, relating to property or financial transactions, on the other. This is why the definition above uses the word "relations" in general terms, without a qualifier indicating that they are financial relations. Adding such a qualifier would mean that civil law is the law of financial matters, which would require separating it from personal status, as is the case in Morocco. By contrast, civil law in some European countries covers all the rules governing relations between people, whether in matters of personal status or real status. Since personal status rules are regarded as civil rules in Morocco, they must be distinguished from real status rules.
This study surveyed and identified 80 species and one subspecies of weeds belonging to 64 genera in 28 families in three citrus orchards in the Lattakia region over four seasons, from September 2014 to August 2015. Density, relative density, frequency, and relative frequency were calculated for each weed species to determine the composition of the weed cover and the importance of the prevalent species. Monocotyledons accounted for 24.69% of the species and dicotyledons for 75.31%. Annual species made up a large share at 85.19%, perennials 13.58%, and biennials 1.23%. The results showed that most weed species in the citrus orchards belong to the grass family (Poaceae), which included 17 species and one subspecies, followed by Fabaceae (9 species), Asteraceae (8 species), and Euphorbiaceae (7 species). The densest monocotyledonous species was purple nutsedge, Cyperus rotundus L., at 20.2 plants/m² in summer, while the densest dicotyledonous weed was annual mercury, Mercurialis annua L., at 59.27 plants/m² in autumn. Another important result of the study is the addition of 3 species and one subspecies of Poaceae to the Syrian flora.
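The four vegetation measures named in the abstract above (density, relative density, frequency, relative frequency) can be sketched as follows. The formulas are the standard quadrat-based definitions and are assumed here, since the abstract does not state the exact sampling protocol. The function name `weed_metrics`, the quadrat size, and the species counts are hypothetical.

```python
def weed_metrics(counts, quadrat_area_m2):
    """Compute per-species vegetation measures from quadrat counts.

    counts: {species name: [individuals counted in each quadrat]}
    All species lists are assumed to cover the same quadrats.
    """
    n_quadrats = len(next(iter(counts.values())))
    total_area = n_quadrats * quadrat_area_m2

    # Density: individuals per square metre over all sampled quadrats.
    density = {s: sum(c) / total_area for s, c in counts.items()}
    # Frequency: percentage of quadrats in which the species occurs.
    frequency = {
        s: 100.0 * sum(1 for x in c if x > 0) / n_quadrats
        for s, c in counts.items()
    }
    total_density = sum(density.values())
    total_frequency = sum(frequency.values())

    return {
        s: {
            "density": density[s],
            "relative_density": 100.0 * density[s] / total_density,
            "frequency": frequency[s],
            "relative_frequency": 100.0 * frequency[s] / total_frequency,
        }
        for s in counts
    }


# Invented counts for two placeholder species in four 1 m2 quadrats.
toy_counts = {"species_a": [4, 0, 6, 2], "species_b": [1, 1, 0, 0]}
metrics = weed_metrics(toy_counts, quadrat_area_m2=1.0)
print(metrics)
```

Relative values express each species as a share of the whole weed community, which is what allows species to be ranked by importance across seasons with different total cover.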
This study was conducted to determine the prevalence and stages of chronic kidney disease, to identify its common causes, and to study its risk factors. The study included 1,314 patients admitted to the Department of Internal Medicine at al-Assad University Hospital in Lattakia, of whom 120 (9.1%) were diagnosed with chronic kidney disease. The causes of chronic kidney disease were: diabetes 41.7%; hypertension 30%; glomerulonephritis 11.7%; obstructive uropathy 5%; Glomerugenetic disease 3.3%; mm 3.3%; polycystic kidney disease 3.3%; and idiopathic 1.7%. Stages 1, 2, 3, 4, and 5 of chronic kidney disease accounted for 10%, 21.7%, 33.3%, 20%, and 15% of cases, respectively. The risk factors were: advanced age (over 50) 75%; low blood albumin 65%; high blood sugar 53.3%; hypertension 38.3%; family history of chronic kidney disease 25%; and high triglycerides and cholesterol 11.7%.
A growing number of studies have addressed precipitation properties, especially since the recent advances in measurement techniques and devices. It is becoming essential to reach a common understanding of what constitutes a rain event when addressing the relation of rain properties to different climate patterns and their influence on a variety of human activities. The aim of this research is to present a suitable rain event definition that would serve future research in this field. Data were acquired in Freising, in southern Germany, in the summer of 2009. Four event definitions were generated and then compared according to rain properties obtained from the disdrometer and the rain gauge. These properties included event count per definition, mean event duration, mean event rain intensity, mean event rain amount, total rain amount, and total number of drops. One definition proved more suitable than the others, as it exploits the precision of the disdrometer.
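To make concrete how the choice of event definition changes the reported properties, here is a minimal sketch. It assumes, purely for illustration, that definitions differ in the minimum dry gap required to separate two events; the abstract does not say what the four definitions actually were. The function names `split_events` and `summarize` and the rain series are hypothetical.

```python
def split_events(rain, min_dry_gap):
    """Split a rain series into events as (start, end) index pairs.

    rain: rain amount per time step (0 means dry).
    min_dry_gap: number of consecutive dry steps that ends an event
    (an assumed criterion, not taken from the paper).
    """
    events, start, last_wet = [], None, None
    for i, amount in enumerate(rain):
        if amount <= 0:
            continue
        if start is None:
            start = i
        elif i - last_wet - 1 >= min_dry_gap:
            # The dry spell was long enough: close the previous event.
            events.append((start, last_wet))
            start = i
        last_wet = i
    if start is not None:
        events.append((start, last_wet))
    return events


def summarize(rain, events):
    """Event count, mean duration (in steps), mean and total rain amount."""
    if not events:
        return {"event_count": 0, "mean_duration_steps": 0.0,
                "mean_event_amount": 0.0, "total_amount": 0.0}
    durations = [end - start + 1 for start, end in events]
    amounts = [sum(rain[start:end + 1]) for start, end in events]
    return {
        "event_count": len(events),
        "mean_duration_steps": sum(durations) / len(events),
        "mean_event_amount": sum(amounts) / len(events),
        "total_amount": sum(amounts),
    }


# Invented series of rain amounts per time step.
series = [0, 1.0, 0.5, 0, 0, 0.2, 0, 0, 0, 0, 0.8, 0.4]
for gap in (2, 3, 5):
    print(gap, summarize(series, split_events(series, gap)))
```

The total rain amount is the same under every definition, while the event count and mean duration shift with the threshold, which is why results from different studies are hard to compare without a shared definition.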
Theoretical developments in archaeology have influenced the nature of the cultural inferences that can be drawn from studying material culture. Since the 1950s, the aims of archaeology have gone beyond identifying the cultural-historical context of material culture. Instead, the focus has been on inferring cultural aspects from artifacts and testing assumptions about material culture. To reach such a research end, the relationships between human behavior and material culture need to be identified more clearly. Moreover, archaeological assumptions based on material evidence ought to be evaluated in a context where both human behavior and material culture can be directly observed. Ethnoarchaeological studies have therefore been developed to clearly identify human-material relationships, to test archaeological assumptions where behaviors can be directly observed, and to identify the factors that can affect these behaviors and their material correlates. Although ethnoarchaeology has been intensively practiced in most parts of the world, fewer studies have been carried out in the Levant. Hence, this paper aims to present the nature and conceptualization of ethnoarchaeology, the main topics that have been studied in this part of the world, and how to use such studies for archaeological reasoning. It also aims to suggest further research aspects that can be studied and how to use such studies, together with archaeological and historical sources, to conceptualize the past in the Levant from the inside.

