Research papers, master's and doctoral theses about identification

Error Identification for Machine Translation with Metric Embedding and Attention

226 - Association for Computational Linguistics 2021 Article

Quality Estimation (QE) for Machine Translation has been shown to reach relatively high accuracy in predicting sentence-level scores, relying on pretrained contextual embeddings and human-produced quality scores. However, the lack of explanations along with decisions made by end-to-end neural models makes the results difficult to interpret. Furthermore, word-level annotated datasets are rare due to the prohibitive effort required to perform this task, while they could provide interpretable signals in addition to sentence-level QE outputs. In this paper, we propose a novel QE architecture which tackles both the word-level data scarcity and the interpretability limitations of recent approaches. Sentence-level and word-level components are jointly pretrained through an attention mechanism based on synthetic data and a set of MT metrics embedded in a common space. Our approach is evaluated on the Eval4NLP 2021 shared task and our submissions reach the first position in all language pairs. The extraction of metric-to-input attention weights shows that different metrics focus on different parts of the source and target text, providing strong rationales in the decision-making process of the QE model.

independent pipeline, identification for machine, error identification, identification, phosphoric acid industry

Data-driven Identification of Idioms in Song Lyrics

216 - Association for Computational Linguistics 2021 Article

The automatic recognition of idioms poses a challenging problem for NLP applications. Whereas native speakers can intuitively handle multiword expressions whose compositional meanings are hard to trace back to individual word semantics, there is still ample scope for improvement regarding computational approaches. We assume that idiomatic constructions can be characterized by gradual intensities of semantic non-compositionality, formal fixedness, and unusual usage context, and introduce a number of measures for these characteristics, comprising count-based and predictive collocation measures together with measures of context (un)similarity. We evaluate our approach on a manually labelled gold standard, derived from a corpus of German pop lyrics. To this end, we apply a Random Forest classifier to analyze the individual contribution of features for automatically detecting idioms, and study the trade-off between recall and precision. Finally, we evaluate the classifier on an independent dataset of idioms extracted from a list of Wikipedia idioms, achieving state-of-the-art accuracy.
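The abstract does not name its count-based collocation measures. As a minimal sketch, assuming pointwise mutual information (PMI) over adjacent word pairs as one such measure (the function name `pmi_scores` and the toy English tokens are illustrative, not taken from the paper):

```python
import math
from collections import Counter


def pmi_scores(tokens):
    """Return PMI (log base 2) for every adjacent word pair in `tokens`.

    Assumption: probabilities are plain relative frequencies, with no
    smoothing and no minimum-count filter.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_unigrams = len(tokens)
    n_bigrams = max(len(tokens) - 1, 1)

    scores = {}
    for (first, second), count in bigrams.items():
        p_pair = count / n_bigrams
        p_first = unigrams[first] / n_unigrams
        p_second = unigrams[second] / n_unigrams
        # PMI compares the observed pair probability with the probability
        # expected if the two words occurred independently.
        scores[(first, second)] = math.log2(p_pair / (p_first * p_second))
    return scores


tokens = "hot dog and hot dog and the cat and the dog".split()
scores = pmi_scores(tokens)
```

In this toy input, the pair ("hot", "dog") scores higher than ("the", "dog") because "hot" is always followed by "dog". A score like this could serve as one feature column for a classifier such as the Random Forest mentioned in the abstract.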

data-driven identification, identification of idioms, phosphoric acid industry

Complex words identification using word-level features for SemEval-2020 Task 1

203 - Association for Computational Linguistics 2021 Article

This article describes a system to predict the complexity of words for the Lexical Complexity Prediction (LCP) shared task hosted at SemEval 2021 (Task 1), using a new English dataset annotated on a Likert scale. Located in the Lexical Semantics track, the task consisted of predicting the complexity value of words in context. A machine learning approach was carried out based on the frequency of the words and several characteristics added at word level. Over these features, a supervised random forest regression algorithm was trained. Several runs were performed with different values to observe the performance of the algorithm. For the evaluation, our best results reported an M.A.E. score of 0.07347, an M.S.E. of 0.00938, and an R.M.S.E. of 0.096871. Our experiments showed that, with a greater number of characteristics, the precision of the classification increases.
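The three error measures quoted above are related: R.M.S.E. is the square root of M.S.E., and the square root of 0.00938 is about 0.0969, which agrees with the reported 0.096871 to three significant figures. A minimal sketch of how the measures are computed, assuming plain Python lists of gold and predicted complexity values (the function name `regression_errors` and the sample numbers are illustrative, not from the paper):

```python
import math


def regression_errors(y_true, y_pred):
    """Return (MAE, MSE, RMSE) for two equal-length sequences of numbers."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    residuals = [pred - true for true, pred in zip(y_true, y_pred)]
    n = len(residuals)
    mae = sum(abs(r) for r in residuals) / n   # mean absolute error
    mse = sum(r * r for r in residuals) / n    # mean squared error
    rmse = math.sqrt(mse)                      # root mean squared error
    return mae, mse, rmse


# Hypothetical gold and predicted complexity values on a 0-1 scale.
mae, mse, rmse = regression_errors([0.0, 0.5, 1.0], [0.1, 0.5, 0.8])
```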

complex words identification, identification using word-level, phosphoric acid industry

HB Deid - HB De-identification tool demonstrator

410 - Association for Computational Linguistics 2021 Article

This paper describes a freely available web-based demonstrator called HB Deid. HB Deid identifies so-called protected health information (PHI) in a text written in Swedish and removes, masks, or replaces it with surrogates or pseudonyms. PHIs are named entities such as personal names, locations, ages, phone numbers, and dates. HB Deid uses a CRF model trained on non-sensitive annotated text in Swedish, as well as a rule-based post-processing step for finding PHI. The final step in obscuring the PHI is then to either mask it, show only the class name, or use a rule-based pseudonymisation system to replace it.
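The final obscuring step can be illustrated with a minimal sketch. It assumes the PHI spans have already been found and are given as non-overlapping (start, end, label) character offsets. The function name `obscure`, the mode names, and the fixed surrogate lookup are all hypothetical. In particular, the lookup is a simplified stand-in for the rule-based pseudonymisation system, whose rules the abstract does not describe.

```python
def obscure(text, spans, mode="class", surrogates=None):
    """Obscure PHI spans in `text`.

    spans: non-overlapping (start, end, label) character offsets.
    mode:  "mask"      -> replace each character with "X"
           "class"     -> show only the class name, e.g. "[NAME]"
           "surrogate" -> substitute a surrogate looked up by label
                          (simplified stand-in for pseudonymisation)
    """
    surrogates = surrogates or {}
    pieces = []
    cursor = 0
    for start, end, label in sorted(spans):
        pieces.append(text[cursor:start])  # untouched text before the span
        if mode == "mask":
            pieces.append("X" * (end - start))
        elif mode == "class":
            pieces.append(f"[{label}]")
        elif mode == "surrogate":
            pieces.append(surrogates.get(label, f"[{label}]"))
        else:
            raise ValueError(f"unknown mode: {mode}")
        cursor = end
    pieces.append(text[cursor:])  # remaining text after the last span
    return "".join(pieces)


# Invented example sentence; offsets are computed rather than hard-coded.
text = "Anna Berg ringde 070-1234567."
name, phone = "Anna Berg", "070-1234567"
spans = [
    (text.index(name), text.index(name) + len(name), "NAME"),
    (text.index(phone), text.index(phone) + len(phone), "PHONE"),
]
```

Calling `obscure(text, spans, mode="class")` on this input yields `[NAME] ringde [PHONE].`, while `mode="mask"` keeps the original length by writing one `X` per character.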

de-identification tool demonstrator, de-identification tool, deid, phosphoric acid industry

BERT-based Multi-Task Model for Country and Province Level MSA and Dialectal Arabic Identification

260 - Association for Computational Linguistics 2021 Article

Dialect and standard language identification are crucial tasks for many Arabic natural language processing applications. In this paper, we present our deep learning-based system, submitted to the second NADI shared task for country-level and province-level identification of Modern Standard Arabic (MSA) and Dialectal Arabic (DA). The system is based on an end-to-end deep Multi-Task Learning (MTL) model to tackle both country-level and province-level MSA/DA identification. The MTL model consists of a shared Bidirectional Encoder Representations from Transformers (BERT) encoder, two task-specific attention layers, and two classifiers. Our key idea is to leverage both the task-discriminative and the inter-task shared features for country and province MSA/DA identification. The obtained results show that our MTL model outperforms single-task models on most subtasks.

province level msa, dialectal arabic identification, dialectal arabic, phosphoric acid industry

Assessment of Different Commercial Brands of Tamoxifen 10 mg Tablets Marketed in Yemen as Anti-breast Cancer

1109 - Damascus University 2009 Research paper

Breast cancer is one of the most common cancers in women; one in nine women will have breast cancer in her lifetime. Tamoxifen is the trans-isomer of a triphenylethylene derivative. The aim of this study is to evaluate the quality and the quantity of the commercial brands of Tamoxifen 10 mg tablets which are registered and marketed in Yemen.

breast cancer, spectrophotometer, Tamoxifen tablets, identification test, disintegration test, dissolution test, weight uniformity, assay
