
Semantic similarity between two sentences in Arabic

Original Arabic title: ايجاد نسبة التشابه الدلالي بين جملتين باللغة العربية (Finding the degree of semantic similarity between two sentences in Arabic)


Added by Damascus University (research paper)

Publication date 2018

Research language: Arabic

Authors: خديجة محمد (student), دانية سنقر (student), هبة الشرع (student), هديل أبو بكر (student)

Created by Khadija Mohammad


Abstract

Text similarity is an important task in several application fields, such as information retrieval, plagiarism detection, machine translation, topic detection, text classification, and text summarization. Finding the similarity between two texts, paragraphs, or sentences is based on measuring, directly or indirectly, the similarity between words. There are two known types of word similarity: lexical and semantic. The first treats words as streams of characters: two words are lexically similar if they share the same characters in the same order. The second aims to quantify the degree to which two words are semantically related; for example, they may be synonyms, represent the same thing, or be used in the same context. In this article, we focus on measuring the semantic similarity between Arabic sentences using several representations.
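The contrast between lexical and semantic word similarity can be illustrated with a minimal Python sketch. It is not taken from the paper: the character-level measure uses the standard library's `difflib.SequenceMatcher`, and `toy_vectors` is a hypothetical, hand-made set of three-dimensional vectors standing in for real word embeddings.

```python
import math
from difflib import SequenceMatcher


def lexical_similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1]: shared characters in the same order."""
    return SequenceMatcher(None, a, b).ratio()


def cosine_similarity(u, v) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical toy vectors (assumption): real systems would load trained embeddings.
toy_vectors = {
    "car": [0.9, 0.1, 0.0],
    "automobile": [0.85, 0.15, 0.05],
    "care": [0.0, 0.2, 0.9],
}

# "car" and "care" share most characters but are unrelated in meaning;
# "car" and "automobile" share almost no characters but are synonyms.
lex_car_care = lexical_similarity("car", "care")
lex_car_auto = lexical_similarity("car", "automobile")
sem_car_care = cosine_similarity(toy_vectors["car"], toy_vectors["care"])
sem_car_auto = cosine_similarity(toy_vectors["car"], toy_vectors["automobile"])
```

With these toy values, the lexical measure ranks "care" closer to "car", while the vector-based measure ranks "automobile" closer, which is the distinction the abstract draws.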


Research summary

This paper addresses the problem of measuring the semantic similarity between two Arabic sentences, a task that matters in many fields, such as information retrieval, plagiarism detection, machine translation, and information extraction. It presents several techniques for computing this similarity, with an emphasis on using a lexical database that contains all the words of the Arabic language and the relations among them. Three main approaches are covered: WordToVector, LMF Dictionaries, and the Wu & Palmer algorithm. Each approach involves a set of steps and sub-techniques, such as IDF and POS_tagging, that are used to improve accuracy. The paper also reviews how words are represented as vectors in a multidimensional space and how techniques such as Word2vec and CBOW are used to train models on large text corpora. Finally, it compares the results obtained with the different approaches and shows how IDF and POS_tagging improve them.
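This page does not describe the paper's exact procedure, so the following Python sketch only illustrates one common way to combine word vectors with IDF: each sentence is represented by the IDF-weighted average of its word vectors, and two sentences are compared by cosine similarity. The smoothed IDF formula, the tiny `corpus`, and the hand-made `toy_vectors` are all assumptions made for illustration; in practice the vectors would come from a trained model such as Word2vec.

```python
import math
from collections import Counter


def idf_weights(corpus):
    """Smoothed IDF per token: log((1 + N) / (1 + df)) + 1 (one common variant)."""
    n_docs = len(corpus)
    doc_freq = Counter(tok for sentence in corpus for tok in set(sentence))
    return {tok: math.log((1 + n_docs) / (1 + df)) + 1.0 for tok, df in doc_freq.items()}


def sentence_vector(tokens, vectors, idf):
    """IDF-weighted average of the vectors of known tokens; unknown tokens are skipped."""
    dim = len(next(iter(vectors.values())))
    total = [0.0] * dim
    weight_sum = 0.0
    for tok in tokens:
        if tok not in vectors:
            continue
        weight = idf.get(tok, 1.0)
        for i, x in enumerate(vectors[tok]):
            total[i] += weight * x
        weight_sum += weight
    if weight_sum == 0.0:
        return total
    return [x / weight_sum for x in total]


def cosine(u, v):
    """Cosine similarity, returning 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def sentence_similarity(s1, s2, vectors, idf):
    return cosine(sentence_vector(s1, vectors, idf), sentence_vector(s2, vectors, idf))


# Hypothetical toy data (assumption), standing in for trained embeddings and a real corpus.
toy_vectors = {
    "the": [0.1, 0.1, 0.1], "cat": [0.9, 0.1, 0.0], "dog": [0.8, 0.2, 0.0],
    "sat": [0.1, 0.9, 0.0], "stock": [0.0, 0.1, 0.9], "fell": [0.1, 0.2, 0.8],
}
corpus = [["the", "cat", "sat"], ["the", "dog", "sat"], ["the", "stock", "fell"]]
idf = idf_weights(corpus)

sim_related = sentence_similarity(corpus[0], corpus[1], toy_vectors, idf)
sim_unrelated = sentence_similarity(corpus[0], corpus[2], toy_vectors, idf)
```

The weighting gives frequent words such as "the" less influence on the sentence vector than rarer, more descriptive words, which is the role the summary attributes to IDF.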

Critical review

This paper is an important step toward improving natural language processing techniques for Arabic, and it offers innovative, detailed solutions to the problem of computing semantic similarity between sentences. It could be improved, however, by giving more detail on how the various parameters for training the models were chosen, and by providing practical examples of how these techniques apply in real-world settings. It would also benefit from a critical analysis of the limitations and challenges these techniques face, such as handling the different dialects of Arabic and processing large texts.

Questions related to the research

What is the main importance of computing semantic similarity between Arabic sentences?

Its main importance lies in its many applications, such as information retrieval, plagiarism detection, machine translation, and information extraction.

What are the three main methods used in the paper to measure semantic similarity?

The three main methods are WordToVector, LMF Dictionaries, and the Wu & Palmer algorithm.

Which techniques are used to improve the accuracy of the semantic similarity results?

The techniques include IDF and POS_tagging, which help identify more accurately the words that are most descriptive in each sentence.

What challenges can semantic similarity techniques for Arabic sentences face?

The challenges include handling the different dialects of Arabic and processing large texts.
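The Wu & Palmer measure named among the three methods scores two concepts by how deep their lowest common ancestor sits in a taxonomy: 2 × depth(LCS) / (depth(a) + depth(b)). The Python sketch below applies that formula to a hypothetical toy taxonomy stored as a child-to-parent mapping. The taxonomy, the function names, and the convention that the root has depth 1 are assumptions made for illustration, not details taken from the paper.

```python
def path_to_root(node, parent):
    """Return the list [node, parent(node), ..., root]."""
    path = [node]
    while node in parent:
        node = parent[node]
        path.append(node)
    return path


def depth(node, parent):
    """Depth counted in nodes, so the root has depth 1 (assumed convention)."""
    return len(path_to_root(node, parent))


def lowest_common_subsumer(a, b, parent):
    """Deepest node that is an ancestor of (or equal to) both a and b, or None."""
    ancestors_of_a = set(path_to_root(a, parent))
    for node in path_to_root(b, parent):  # walks upward from b, so the first hit is the deepest
        if node in ancestors_of_a:
            return node
    return None


def wu_palmer(a, b, parent):
    """Wu & Palmer similarity: 2 * depth(lcs) / (depth(a) + depth(b))."""
    lcs = lowest_common_subsumer(a, b, parent)
    if lcs is None:
        return 0.0
    return 2.0 * depth(lcs, parent) / (depth(a, parent) + depth(b, parent))


# Hypothetical toy taxonomy (assumption): child -> parent, with "entity" as the root.
toy_parent = {
    "animal": "entity",
    "vehicle": "entity",
    "cat": "animal",
    "dog": "animal",
    "car": "vehicle",
}

sim_cat_dog = wu_palmer("cat", "dog", toy_parent)
sim_cat_car = wu_palmer("cat", "car", toy_parent)
```

In this toy tree, "cat" and "dog" meet at "animal" (depth 2), while "cat" and "car" meet only at the root, so the first pair scores higher.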

Keywords

Semantic similarity, natural language processing, Arabic language, Word2vec, LMF Dictionaries, Wu & Palmer algorithm, IDF, POS_tagging

References used

http://aclweb.org/anthology/W17-1303

https://en.wikipedia.org/wiki/Word2vec

https://github.com/bakrianoo/aravec

https://rd.springer.com/article/10.1007/s40595-016-0080-2

https://trac.research.cc.gatech.edu/ccl/export/158/SecondMindProject/SM/SM.WordNet/Paper/WordNetDotNet_Semantic_Similarity.pdf


Evaluation Datasets for Cross-lingual Semantic Textual Similarity

Association for Computational Linguistics, 2021 (article)

Semantic textual similarity (STS) systems estimate the degree of the meaning similarity between two sentences. Cross-lingual STS systems estimate the degree of the meaning similarity between two sentences, each in a different language. State-of-the-art algorithms usually employ a strongly supervised, resource-rich approach difficult to use for poorly-resourced languages. However, any approach needs to have evaluation data to confirm the results. In order to simplify the evaluation process for poorly-resourced languages (in terms of STS evaluation datasets), we present new datasets for cross-lingual and monolingual STS for languages without this evaluation data. We also present the results of several state-of-the-art methods on these data which can be used as a baseline for further research. We believe that this article will not only extend the current STS research to other languages, but will also encourage competition on this new evaluation data.

Tags: semantic textual similarity, cross-lingual semantic textual, semantic textual

Association for Computational Linguistics, 2021 (article)

ROUGE is a widely used evaluation metric in text summarization. However, it is not suitable for the evaluation of abstractive summarization systems as it relies on lexical overlap between the gold standard and the generated summaries. This limitation becomes more apparent for agglutinative languages with very large vocabularies and high type/token ratios. In this paper, we present semantic similarity models for Turkish and apply them as evaluation metrics for an abstractive summarization task. To achieve this, we translated the English STSb dataset into Turkish and presented the first semantic textual similarity dataset for Turkish as well. We showed that our best similarity models have better alignment with average human judgments compared to ROUGE in both Pearson and Spearman correlations.

Tags: similarity based evaluation, semantic similarity based, based evaluation

Association for Computational Linguistics, 2021 (article)

For many NLP applications of online reviews, comparison of two opinion-bearing sentences is key. We argue that, while general purpose text similarity metrics have been applied for this purpose, there has been limited exploration of their applicability to opinion texts. We address this gap in the literature, studying: (1) how humans judge the similarity of pairs of opinion-bearing sentences; and, (2) the degree to which existing text similarity metrics, particularly embedding-based ones, correspond to human judgments. We crowdsourced annotations for opinion sentence pairs and our main findings are: (1) annotators tend to agree on whether or not opinion sentences are similar or different; and (2) embedding-based metrics capture human judgments of "opinion similarity" but not "opinion difference". Based on our analysis, we identify areas where the current metrics should be improved. We further propose to learn a similarity metric for opinion similarity via fine-tuning the Sentence-BERT sentence-embedding network based on review text and weak supervision by review ratings. Experiments show that our learned metric outperforms existing text similarity metrics and, in particular, shows significantly higher correlations with human annotations for differing opinions.

Tags: opinion-bearing sentences, opinion-bearing

Looking for a Role for Word Embeddings in Eye-Tracking Features Prediction: Does Semantic Similarity Help?

Association for Computational Linguistics, 2021 (article)

Eye-tracking psycholinguistic studies have suggested that context-word semantic coherence and predictability influence language processing during the reading activity. In this study, we investigate the correlation between the cosine similarities computed with word embedding models (both static and contextualized) and eye-tracking data from two naturalistic reading corpora. We also studied the correlations of surprisal scores computed with three state-of-the-art language models. Our results show strong correlation for the scores computed with BERT and GloVe, suggesting that similarity can play an important role in modeling reading times.

Tags: eye-tracking features prediction, features prediction, eye-tracking features

Semantic Answer Similarity for Evaluating Question Answering Models

Association for Computational Linguistics, 2021 (article)

The evaluation of question answering models compares ground-truth annotations with model predictions. However, as of today, this comparison is mostly lexical-based and therefore misses out on answers that have no lexical overlap but are still semantically similar, thus treating correct answers as false. This underestimation of the true performance of models hinders user acceptance in applications and complicates a fair comparison of different models. Therefore, there is a need for an evaluation metric that is based on semantics instead of pure string similarity. In this short paper, we present SAS, a cross-encoder-based metric for the estimation of semantic answer similarity, and compare it to seven existing metrics. To this end, we create an English and a German three-way annotated evaluation dataset containing pairs of answers along with human judgment of their semantic similarity, which we release along with an implementation of the SAS metric and the experiments. We find that semantic similarity metrics based on recent transformer models correlate much better with human judgment than traditional lexical similarity metrics on our two newly created datasets and one dataset from related work.

Tags: evaluating question answering, evaluating question
