Research papers and master's and doctoral theses about Ensemble

Improving Decision Support Systems in Education Systems Using Data Mining and Machine Learning Techniques

2161 - Tishreen University 2021 research paper

Educational data mining aims to study the available data in the educational field and extract hidden knowledge from it, so that this knowledge can be used to enhance the education process and support decisions that improve students' academic performance. This study proposes the use of data mining techniques to improve student performance prediction. Three classification algorithms (Naïve Bayes, J48, Support Vector Machine) were applied to the student performance database, and a new classifier was then designed to combine the results of those individual classifiers using the Voting method. The WEKA tool, which supports many data mining algorithms and methods, was used. The results show that the ensemble classifier has the highest accuracy for predicting students' levels compared to the other classifiers, achieving an accuracy of 74.8084%. The simple k-means clustering algorithm was useful for grouping similar students into separate groups, making it possible to understand the characteristics of each group and to guide each group separately.
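The voting step described in this abstract can be illustrated with a minimal sketch. The abstract does not say which combination rule was configured, so plain majority voting is assumed here; the label values, the three prediction lists, and the tie-breaking rule (first label seen wins) are hypothetical and are not taken from the study.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the most frequent label; ties go to the label that appears first."""
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label


# Hypothetical per-student predictions from the three base classifiers.
naive_bayes = ["high", "low", "medium"]
j48 = ["high", "medium", "medium"]
svm = ["low", "medium", "high"]

# Combine the three predictions for each student into one ensemble label.
ensemble = [majority_vote(votes) for votes in zip(naive_bayes, j48, svm)]
print(ensemble)  # ['high', 'medium', 'medium']
```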

Clustering, Data Mining, Clustering Algorithms, Ensemble, Voting Method, WEKA

Devil's Advocate: Novel Boosting Ensemble Method from Psychological Findings for Text Classification

443 - Association for Computational Linguistics 2021 article

We present a new form of ensemble method, Devil's Advocate, which uses a deliberately dissenting model to force the other submodels within the ensemble to collaborate better. Our method consists of two different training settings: one follows the conventional training process (Norm), and the other is trained on artificially generated labels (DevAdv). After training, the Norm models are fine-tuned through an additional loss function that uses the DevAdv model as a constraint. To make a final decision, the proposed ensemble model sums the scores of the Norm models and then subtracts the score of the DevAdv model. The DevAdv model improves the overall performance of the other models within the ensemble. In addition to being grounded in psychological findings, our ensemble framework shows comparable or improved performance on 5 text classification tasks when compared to conventional ensemble methods.
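The final decision rule stated in this abstract (sum the Norm scores, subtract the DevAdv score) can be sketched as follows. This covers only the inference-time combination, not the fine-tuning loss; the function name, the per-class score layout, and the example numbers are assumptions made for illustration.

```python
def devils_advocate_decision(norm_scores, devadv_scores):
    """Sum the per-class scores of the Norm models, subtract the DevAdv
    scores, and return (index of the best class, combined scores)."""
    n_classes = len(devadv_scores)
    combined = [
        sum(model[c] for model in norm_scores) - devadv_scores[c]
        for c in range(n_classes)
    ]
    best_class = max(range(n_classes), key=lambda c: combined[c])
    return best_class, combined


# Hypothetical scores for a two-class task: two Norm models, one DevAdv model.
norm = [[0.60, 0.40], [0.45, 0.55]]
devadv = [0.70, 0.30]

label, scores = devils_advocate_decision(norm, devadv)
print(label)  # 1
```

In this made-up example the Norm models alone would narrowly favor class 0 (1.05 against 0.95), and subtracting the dissenting model's scores flips the decision to class 1.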

devil advocate, boosting ensemble method, psychological findings

Ensemble Fine-tuned mBERT for Translation Quality Estimation

1061 - Association for Computational Linguistics 2021 article

Quality Estimation (QE) is an important component of the machine translation workflow, as it assesses the quality of the translated output without consulting reference translations. In this paper, we discuss our submission to the WMT 2021 QE Shared Task. We participate in the Task 2 sentence-level sub-task, which challenges participants to predict the HTER score for sentence-level post-editing effort. Our proposed system is an ensemble of multilingual BERT (mBERT)-based regression models, which are generated by fine-tuning on different input settings. It demonstrates comparable performance with respect to Pearson's correlation and beats the baseline system in MAE/RMSE for several language pairs. In addition, we adapt our system to the zero-shot setting by exploiting target language-relevant language pairs and pseudo-reference translations.
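The three evaluation measures named in this abstract have standard definitions, sketched below in plain Python. The function names and sample values are illustrative assumptions; this is not the shared task's official scoring script.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def pearson(y_true, y_pred):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    if var_t == 0 or var_p == 0:
        raise ValueError("Pearson correlation is undefined for constant input")
    return cov / math.sqrt(var_t * var_p)


# Hypothetical gold HTER scores and system predictions.
gold = [0.10, 0.40, 0.35, 0.80]
pred = [0.20, 0.30, 0.40, 0.70]
print(mae(gold, pred), rmse(gold, pred), pearson(gold, pred))
```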

translation quality estimation, ensemble fine-tuned mbert

Named Entity Recognition in Historic Legal Text: A Transformer and State Machine Ensemble Method

726 - Association for Computational Linguistics 2021 article

Older legal texts are often scanned and digitized via Optical Character Recognition (OCR), which results in numerous errors. Although spelling and grammar checkers can correct much of the scanned text automatically, Named Entity Recognition (NER) is challenging, making correction of names difficult. To solve this, we developed an ensemble language model using a transformer neural network architecture combined with a finite state machine to extract names from English-language legal text. We use the US-based English language Harvard Caselaw Access Project for training and testing. Then, the extracted names are subjected to heuristic textual analysis to identify errors, make corrections, and quantify the extent of problems. With this system, we are able to extract most names, automatically correct numerous errors and identify potential mistakes that can later be reviewed for manual correction.

English-language contracts, historic legal text, machine ensemble method

UR@NLP\_A\_Team @ GermEval 2021: Ensemble-based Classification of Toxic, Engaging and Fact-Claiming Comments

768 - Association for Computational Linguistics 2021 article

In this paper, we report on our approach to the GermEval 2021 Shared Task on the Identification of Toxic, Engaging, and Fact-Claiming Comments for the German language. We submitted three runs for each subtask, based on ensembles of three models each, using contextual embeddings from pre-trained language models with SVM and neural-network-based classifiers. We include language-specific as well as language-agnostic language models, both with and without fine-tuning. We observe that, in the runs we submitted, the SVM models overfitted the training data, and this affected the aggregation method (simple majority voting) of the ensembles. The model records a lower performance on the test set than on the training set. Exploring the issue of overfitting, we uncovered that, due to a bug in the pipeline, the runs we submitted had not been trained on the full set but only on a small training set. Therefore, in this paper we also include the results obtained when training on the full training set, which demonstrate the power of ensembles.

ensemble-based classification, classification of toxic

DeepBlueAI at SemEval-2021 Task 1: Lexical Complexity Prediction with A Deep Ensemble Approach

881 - Association for Computational Linguistics 2021 article

Lexical complexity plays an important role in reading comprehension. Lexical complexity prediction (LCP) can be used not only as part of Lexical Simplification systems, but also as a stand-alone application to help people read better. This paper presents the winning system we submitted to the LCP Shared Task of SemEval 2021, which is capable of handling both subtasks. We first fine-tune a number of pre-trained language models (PLMs) with various hyperparameters and different training strategies, such as pseudo-labelling and data augmentation. Then an effective stacking mechanism is applied on top of the fine-tuned PLMs to obtain the final prediction. Experimental results on the Complex dataset show the validity of our method, and we rank first and second for subtask 2 and 1.
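Stacking, as mentioned in this abstract, trains a second-level model on the outputs of the base models. The abstract does not describe the stacking mechanism itself, so the sketch below assumes the simplest case: two base models and a linear meta-learner without intercept, fitted by least squares on held-out predictions. All names and numbers are hypothetical.

```python
def fit_blend_weights(p1, p2, y):
    """Least-squares weights (w1, w2) minimizing sum((w1*p1 + w2*p2 - y)^2),
    solved through the 2x2 normal equations."""
    a = sum(x * x for x in p1)
    b = sum(x * z for x, z in zip(p1, p2))
    d = sum(z * z for z in p2)
    e = sum(x * t for x, t in zip(p1, y))
    f = sum(z * t for z, t in zip(p2, y))
    det = a * d - b * b
    if abs(det) < 1e-12:
        raise ValueError("base predictions are collinear; weights are not unique")
    return (d * e - b * f) / det, (a * f - b * e) / det


def blend(weights, p1, p2):
    """Apply the fitted meta-learner to new base-model predictions."""
    w1, w2 = weights
    return [w1 * x + w2 * z for x, z in zip(p1, p2)]


# Hypothetical held-out predictions of two fine-tuned models and gold scores.
model_a = [0.10, 0.40, 0.80]
model_b = [0.30, 0.20, 0.60]
gold = [0.25, 0.25, 0.65]

weights = fit_blend_weights(model_a, model_b, gold)
final = blend(weights, [0.50], [0.70])
print(weights, final)
```

Fitting the meta-learner on held-out predictions rather than training-set predictions is the usual way to keep it from simply rewarding the base model that overfits most.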

deep ensemble approach, ensemble approach, deep ensemble

DeepBlueAI at WANLP-EACL2021 task 2: A Deep Ensemble-based Method for Sarcasm and Sentiment Detection in Arabic

711 - Association for Computational Linguistics 2021 article

Sarcasm is one of the main challenges for sentiment analysis systems because it expresses opinions through implicit, indirect phrasing, especially in Arabic. This paper presents the system we submitted to the Sarcasm and Sentiment Detection task of WANLP-2021, which is capable of handling both subtasks. We first fine-tune two kinds of pre-trained language models (PLMs) with different training strategies. Then an effective stacking mechanism is applied on top of the fine-tuned PLMs to obtain the final prediction. Experimental results on the ArSarcasm-v2 dataset show the effectiveness of our method, and we rank third and second for subtask 1 and 2.

deep ensemble-based method, deep ensemble-based, sentiment detection task

Building a Decision Support System to Solve the Problem of Student Dropout in Compulsory Education

2059 - Higher Institute for Applied Sciences and Technology 2017 master's thesis

Student dropout is a serious problem in education. Many factors can influence student dropout, so it is not an easy issue to resolve. The scope of this research is to examine the accuracy of ensemble techniques for predicting student dropout, particularly for primary school students in the Syrian Arab Republic. The new classifier is designed based on the "Stacking" ensemble technique and the application of Feature Selection techniques, where the database suffers from the problem of imbalance. This new classifier was compared with individual classifiers using the Cross-Validation technique. The study concluded that the proposed classifier is the best of those compared at predicting student dropout.

Feature Selection, Ensemble Classifiers, SMOTE, Stacking Method, Dropout Students