Transformer models fine-tuned with a sequence labeling objective have become the dominant choice for named entity recognition tasks. However, a self-attention mechanism with unconstrained length can fail to fully capture local dependencies, particularly when training data is limited. In this paper, we propose a novel joint training objective which better captures the semantics of words corresponding to the same entity. By augmenting the training objective with a group-consistency loss component, we enhance our ability to capture local dependencies while still enjoying the advantages of the unconstrained self-attention mechanism. On the CoNLL2003 dataset, our method achieves a test F1 of 93.98 with a single transformer model. More importantly, our fine-tuned CoNLL2003 model displays significant gains in generalization to out-of-domain datasets: on the OntoNotes subset we achieve an F1 of 72.67, which is 0.49 absolute points better than the baseline, and on the WNUT16 set an F1 of 68.22, a gain of 0.48 points. Furthermore, on the WNUT17 dataset we achieve an F1 of 55.85, yielding a 2.92-point absolute improvement.
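The abstract does not spell out the exact form of the group-consistency term. A minimal sketch of one plausible formulation, in which the representations of tokens belonging to the same gold entity are pulled toward their group mean, could look as follows (the function name, the squared-distance penalty, and the weighting hyperparameter are all assumptions for illustration):

```python
import torch
import torch.nn.functional as F

def group_consistency_loss(hidden, entity_ids):
    """Hypothetical group-consistency term: penalize the distance of each
    token representation from the mean representation of its entity span.

    hidden:     (seq_len, dim) token representations from the transformer
    entity_ids: (seq_len,) entity-span id per token, -1 for non-entity tokens
    """
    loss, n_groups = hidden.new_zeros(()), 0
    for gid in entity_ids.unique():
        if gid < 0:
            continue
        members = hidden[entity_ids == gid]          # tokens of one entity
        centroid = members.mean(dim=0, keepdim=True)
        loss = loss + F.mse_loss(members, centroid.expand_as(members))
        n_groups += 1
    return loss / max(n_groups, 1)

# Joint objective: token-level cross-entropy plus the consistency term,
# weighted by a hypothetical hyperparameter lambda_gc:
# total_loss = ce_loss + lambda_gc * group_consistency_loss(hidden, entity_ids)
```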
One of the mechanisms through which disinformation spreads online, in particular through social media, is the use of propaganda techniques. These include specific rhetorical and psychological strategies, ranging from leveraging emotions to exploiting logical fallacies. In this paper, our goal is to push forward research on propaganda detection based on text analysis, given the crucial role these methods may play in addressing this major societal issue. More precisely, we propose a supervised approach to classify textual snippets both as propaganda or not and according to the specific propaganda technique applied, as well as a detailed linguistic analysis of the features characterising propaganda in text (e.g., semantic, sentiment and argumentation features). Extensive experiments conducted on two available propaganda resources (i.e., the NLP4IF'19 and SemEval'20-Task 11 datasets) show that the proposed approach, leveraging different language models and the investigated linguistic features, achieves very promising results on propaganda classification, both at sentence level and at fragment level.
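At the sentence level, the described setup amounts to supervised text classification with a pretrained language model. A minimal sketch (the checkpoint, label count, and example sentence are illustrative placeholders, not the authors' exact configuration):

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Illustrative checkpoint; the paper experiments with several language models.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)  # propaganda vs. non-propaganda

sentence = "They want to destroy everything we stand for!"
inputs = tokenizer(sentence, return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.softmax(dim=-1))  # class probabilities (head untrained here)
```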
Predicting the difficulty of domain-specific vocabulary is an important task towards a better understanding of a domain and towards enhancing communication between lay people and experts. We investigate German closed noun compounds and focus on the interaction of compound-based lexical features (such as frequency and productivity) and terminology-based features (contrasting domain-specific and general language) across word representations and classifiers. Our prediction experiments complement insights from classification using (a) manually designed features to characterise termhood and compound formation and (b) compound and constituent word embeddings. We find that for a broad binary distinction into 'easy' vs. 'difficult', general-language compound frequency is sufficient, but for a more fine-grained four-class distinction it is crucial to include contrastive termhood features as well as compound and constituent features.
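A common way to operationalize a contrastive termhood feature is the log-ratio of a word's relative frequency in a domain corpus versus a general-language corpus; the paper's exact feature definitions may differ. A small sketch of this idea (function name and smoothing are assumptions):

```python
import math

def termhood(word, domain_freq, general_freq, domain_total, general_total):
    """Hypothetical contrastive termhood score: log-ratio of the word's
    relative frequency in a domain corpus vs. a general-language corpus.
    Add-one smoothing avoids division by zero for unseen words."""
    p_domain = (domain_freq.get(word, 0) + 1) / (domain_total + 1)
    p_general = (general_freq.get(word, 0) + 1) / (general_total + 1)
    return math.log(p_domain / p_general)

# Toy usage: a word frequent in the domain corpus but rare in general
# language receives a high (positive) termhood score.
print(termhood("Herzkatheter", {"Herzkatheter": 50}, {}, 10_000, 10_000))
```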
This paper describes team LCP-RIT's submission to SemEval-2021 Task 1: Lexical Complexity Prediction (LCP). The task organizers provided participants with an augmented version of CompLex (Shardlow et al., 2020), an English multi-domain dataset in which words in context were annotated with respect to their complexity using a five-point Likert scale. Our system uses logistic regression and a wide range of linguistic features (e.g., psycholinguistic features, n-grams, word frequency, POS tags) to predict the complexity of single words in this dataset. We analyze the impact of different linguistic features on the classification performance and evaluate the results in terms of mean absolute error, mean squared error, Pearson correlation, and Spearman correlation.
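A minimal sketch of such a feature-based pipeline, including the four reported evaluation metrics (the toy feature matrix, e.g. word length, log frequency, and a POS-tag id, stands in for the team's actual feature set):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error
from scipy.stats import pearsonr, spearmanr

# Illustrative features per word: [length, log frequency, POS-tag id].
X_train = np.array([[4, 9.2, 1], [11, 3.1, 2], [6, 7.5, 1], [13, 2.0, 3]])
y_train = np.array([1, 4, 2, 5])   # complexity bins from the Likert scale
X_test = np.array([[5, 8.8, 1], [12, 2.4, 3], [7, 6.9, 2]])
y_test = np.array([1, 5, 2])

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
pred = clf.predict(X_test)

print("MAE:", mean_absolute_error(y_test, pred))
print("MSE:", mean_squared_error(y_test, pred))
print("Pearson:", pearsonr(y_test, pred)[0])
print("Spearman:", spearmanr(y_test, pred)[0])
```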
We study the usefulness of hateful metaphors as features for identifying the type and target of hate speech in Dutch Facebook comments. For this purpose, all hateful metaphors in the Dutch LiLaH corpus were annotated and interpreted in line with Conceptual Metaphor Theory and Critical Metaphor Analysis. We provide SVM and BERT/RoBERTa results, and investigate the effect of different metaphor information encoding methods on hate speech type and target detection accuracy. The results of the conducted experiments show that hateful metaphor features improve model performance for both tasks. To our knowledge, this is the first time that the effectiveness of hateful metaphors as an information source for hate speech classification has been investigated.
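One simple encoding method is to append a binary "contains hateful metaphor" indicator to bag-of-words features for the SVM. The sketch below illustrates this idea; the encoding scheme and toy data are assumptions, not necessarily one of the paper's exact methods:

```python
import numpy as np
from scipy.sparse import csr_matrix, hstack
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC

comments = ["toy comment one", "toy comment two", "toy comment three"]
has_metaphor = np.array([[1], [0], [1]])   # annotated hateful-metaphor flag
labels = ["violence", "none", "insult"]    # illustrative hate speech types

tfidf = TfidfVectorizer()
X_text = tfidf.fit_transform(comments)
# Append the metaphor indicator as one extra feature column.
X = hstack([X_text, csr_matrix(has_metaphor)])

clf = LinearSVC().fit(X, labels)
print(clf.predict(X))
```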
Building NLP systems that serve everyone requires accounting for dialect differences. But dialects are not monolithic entities: rather, distinctions between and within dialects are captured by the presence, absence, and frequency of dozens of dialect features in speech and text, such as the deletion of the copula in "He ∅ running". In this paper, we introduce the task of dialect feature detection, and present two multitask learning approaches, both based on pretrained transformers. For most dialects, large-scale annotated corpora for these features are unavailable, making it difficult to train recognizers. Instead, we train our models on a small number of minimal pairs, building on how linguists typically define dialect features. Evaluation on a test set of 22 dialect features of Indian English demonstrates that these models learn to recognize many features with high accuracy, and that a few minimal pairs can be as effective for training as thousands of labeled examples. We also demonstrate the downstream applicability of dialect feature detection both as a measure of dialect density and as a dialect classifier.
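Training on minimal pairs can be read as constructing, for each dialect feature, paired examples that differ only in that feature. A toy illustration of this data construction (the sentences and feature names are invented for illustration and are not from the paper's test set):

```python
# Each minimal pair differs only in the presence of one dialect feature,
# so a pair yields one positive and one negative example for that feature.
minimal_pairs = {
    "copula_deletion": [("He running to the store.",
                         "He is running to the store.")],
    "article_omission": [("She bought new car.",
                          "She bought a new car.")],
}

training_examples = []
for feature, pairs in minimal_pairs.items():
    for with_feat, without_feat in pairs:
        training_examples.append((with_feat, feature, 1))
        training_examples.append((without_feat, feature, 0))

print(training_examples)
```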
Eye-tracking psycholinguistic studies have suggested that context-word semantic coherence and predictability influence language processing during reading. In this study, we investigate the correlation between the cosine similarities computed with word embedding models (both static and contextualized) and eye-tracking data from two naturalistic reading corpora. We also study the correlations of surprisal scores computed with three state-of-the-art language models. Our results show a strong correlation for the scores computed with BERT and GloVe, suggesting that similarity can play an important role in modeling reading times.
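The correlation analysis reduces to comparing per-word similarity scores against per-word gaze measures. A toy sketch with random stand-in vectors (real inputs would be GloVe vectors or BERT hidden states, and gaze durations from the reading corpora):

```python
import numpy as np
from scipy.stats import spearmanr

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Illustrative stand-ins: one context vector (e.g. mean of the preceding
# words' embeddings) and one embedding per target word.
context_vec = np.random.rand(300)
target_vecs = [np.random.rand(300) for _ in range(20)]

similarities = [cosine(context_vec, w) for w in target_vecs]
reading_times = np.random.rand(20)   # toy per-word gaze durations

rho, p = spearmanr(similarities, reading_times)
print(f"Spearman rho = {rho:.3f} (p = {p:.3f})")
```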
In this paper, an algorithm is designed to extract cylinders, slots and pockets from CAD models saved in STL files, based on a rule-based method and a graph-based method. In addition, a Windows application was developed using Visual Studio C# that allows the user to import a CAD model, extract its features, and view their geometric information (cylinder diameter, height and center coordinates; width, height and length for slots and pockets), as well as all the surfaces that each feature consists of.
The proposed algorithm consists of multiple steps: the input model is first divided into surfaces using a region-growing method (see the sketch below); cylinder features are then extracted using a rule-based method; slots and pockets are extracted using a graph-based method; and finally the geometric information of each extracted feature is calculated.
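As a rough illustration of the segmentation step, region growing over an STL mesh typically groups adjacent triangles whose normals vary smoothly. The sketch below follows that generic pattern; the adjacency structure and angle threshold are assumptions, not the paper's exact procedure:

```python
import numpy as np

def region_growing(normals, adjacency, angle_threshold_deg=15.0):
    """Group mesh triangles into surface regions: a triangle joins its
    neighbor's region if their normals differ by less than the threshold.

    normals:   (n, 3) unit normal per triangle (from the STL facets)
    adjacency: list of neighbor-index lists, one per triangle
    """
    cos_thresh = np.cos(np.radians(angle_threshold_deg))
    region = [-1] * len(normals)
    current = 0
    for seed in range(len(normals)):
        if region[seed] != -1:
            continue
        stack = [seed]
        region[seed] = current
        while stack:
            tri = stack.pop()
            for nb in adjacency[tri]:
                if region[nb] == -1 and np.dot(normals[tri], normals[nb]) > cos_thresh:
                    region[nb] = current
                    stack.append(nb)
        current += 1
    return region  # region id per triangle; each region is a candidate surface
```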
The results show that the proposed algorithm can extract cylinder, slot and pocket features from CAD models saved in STL files and calculate the geometric information for each extracted feature.
Relation extraction systems have made extensive use of features generated by linguistic analysis modules, but errors in these features propagate into errors in relation detection and classification. In this work, we depart from these traditional approaches and their complicated feature engineering by introducing a convolutional neural network for relation extraction that automatically learns features from sentences and minimizes the dependence on external toolkits and resources. Our model takes advantage of multiple filter window sizes and of pre-trained word embeddings used as an initializer in a non-static architecture to improve performance.
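A minimal PyTorch sketch of such a multi-window convolutional encoder (the layer sizes and relation count are illustrative, and position features often used in relation extraction are omitted):

```python
import torch
import torch.nn as nn

class RelationCNN(nn.Module):
    """Sentence encoder with parallel convolutions of several window sizes,
    max-pooled and concatenated, then classified into relation types."""
    def __init__(self, vocab_size, emb_dim=300, n_filters=150,
                 windows=(2, 3, 4, 5), n_relations=19):
        super().__init__()
        # In a non-static setup these embeddings would be initialized from
        # pre-trained vectors and fine-tuned during training.
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.convs = nn.ModuleList(
            nn.Conv1d(emb_dim, n_filters, kernel_size=w) for w in windows)
        self.out = nn.Linear(n_filters * len(windows), n_relations)

    def forward(self, token_ids):                   # (batch, seq_len)
        x = self.embed(token_ids).transpose(1, 2)   # (batch, emb, seq)
        pooled = [torch.relu(c(x)).max(dim=2).values for c in self.convs]
        return self.out(torch.cat(pooled, dim=1))   # relation logits

model = RelationCNN(vocab_size=20000)
logits = model(torch.randint(0, 20000, (8, 40)))    # toy batch of 8 sentences
```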
This research addresses the artistic and stylistic features of the animal fable in La Fontaine's poetry: the multiplicity of sources and diversity of the stories, the spirit of humor, the varied musical rhythm, and the fable's relation to social criticism.