Adaptive Spam Detection Inspired by a Cross-Regulation Model of Immune Dynamics: A Study of Concept Drift


Published by Alaa Abi Haidar

Publication date: 2008

Research field: Informatics Engineering

Paper language: English

Authors: Alaa Abi-Haidar, Luis M. Rocha

Artificial Intelligence, Information Retrieval, Adaptation and Self-Organizing Systems


This paper proposes a novel solution to spam detection inspired by a model of the adaptive immune system known as the cross-regulation model. We report on the testing of a preliminary algorithm on six e-mail corpora. We also compare our results, statically and dynamically, with those obtained by the Naive Bayes classifier and by another binary classification method we developed previously for biomedical text-mining applications. We show that the cross-regulation model is competitive against both and thus promising as a bio-inspired algorithm for spam detection in particular, and binary classification in general.

Yiming Xu, Diego Klabjan 2020

In model serving, keeping one fixed model during the entire, often life-long, inference process is usually detrimental to model performance: the data distribution evolves over time, making a model trained on historical data unreliable. It is important to detect changes and retrain the model in time. Existing methods generally have three weaknesses: 1) using only the classification error rate as a signal, 2) assuming ground-truth labels are available immediately after the features of a sample are received, and 3) being unable to decide what data to use to retrain the model when a change occurs. We address the first problem by utilizing six different signals to capture a wide range of characteristics of the data, and the second by allowing lagged labels, where the label for a given set of features is received after a delay. For the third problem, our proposed method automatically decides what data to use for retraining based on the signals. Extensive experiments on structured and unstructured data for different types of data changes establish that our method consistently outperforms state-of-the-art methods by a large margin.

Artificial Intelligence

A Knowledge Mining Model for Ranking Institutions using Rough Computing with Ordering Rules and Formal Concept analysis

D. P. Acharjya, L. Ezhilarasi 2011

The emergence of computers and the information technology revolution have brought tremendous changes to the real world and opened a new dimension for intelligent data analysis. It is well established that the right information at the right time and place yields better knowledge. However, a challenge arises when a large volume of inconsistent data is given for decision making and knowledge extraction. To handle such imprecise data, researchers have in the recent past developed several important mathematical tools, namely fuzzy sets, intuitionistic fuzzy sets, rough sets, formal concept analysis, and ordering rules. It is also observed that many information systems contain numerical attribute values that are almost, rather than exactly, similar. To handle this type of information system, in this paper we use two processes: a pre-process and a post-process. In the pre-process we use rough sets on an intuitionistic fuzzy approximation space with ordering rules to find the knowledge, whereas in the post-process we use formal concept analysis to explore better knowledge and the vital factors affecting decisions.

Artificial Intelligence, Information Retrieval

CURIE: A Cellular Automaton for Concept Drift Detection

Jesus L. Lobo, Javier Del Ser, Eneko Osaba 2020

Data stream mining extracts information from large quantities of data flowing fast and continuously (data streams). Such streams are usually affected by changes in the data distribution, giving rise to a phenomenon referred to as concept drift. Thus, learning models must detect and adapt to such changes, so as to exhibit good predictive performance after a drift has occurred. In this regard, the development of effective drift detection algorithms becomes a key factor in data stream mining. In this work we propose CURIE, a drift detector relying on cellular automata. Specifically, in CURIE the distribution of the data stream is represented in the grid of a cellular automaton, whose neighborhood rule can then be utilized to detect possible distribution changes over the stream. Computer simulations are presented and discussed to show that CURIE, when hybridized with other base learners, renders competitive behavior in terms of detection metrics and classification accuracy. CURIE is compared with well-established drift detectors over synthetic datasets with varying drift characteristics.

Machine Learning

Adaptive Summaries: A Personalized Concept-based Summarization Approach by Learning from Users Feedback

Samira Ghodratnama, Mehrdad Zakershahrak, Fariborz Sobhanmanesh 2020

Efficiently exploring a tremendous amount of data to make a decision, much like answering a complicated question, is challenging in many real-world application scenarios. In this context, automatic summarization is of substantial importance, as it provides the foundation for big data analytics. Traditional summarization approaches optimize the system to produce a short, static summary intended to fit all users. They do not consider the subjective aspect of summarization, i.e., what is deemed valuable by different users, which makes these approaches impractical in real-world use cases. This paper proposes an interactive concept-based summarization model, called Adaptive Summaries, that helps users make their desired summary instead of producing a single inflexible summary. The system learns gradually from user-provided information as users interact with it by giving feedback in an iterative loop. Users can choose to either reject or accept a concept for inclusion in the summary, and can indicate the importance of that concept from their perspective and the confidence level of their feedback. The proposed approach can guarantee interactive speed to keep the user engaged in the process. Furthermore, it eliminates the need for reference summaries, which is a challenging issue for summarization tasks. Evaluations show that Adaptive Summaries helps users make high-quality summaries based on their preferences by maximizing the user-desired content in the generated summaries.

Artificial Intelligence

Automatic Learning to Detect Concept Drift

Hang Yu, Tianyu Liu, Jie Lu 2021

Many methods have been proposed to detect concept drift, i.e., a change in the distribution of streaming data, because concept drift causes a decrease in the prediction accuracy of algorithms. However, most current detection methods are based on assessing the degree of change in the data distribution and cannot identify the type of concept drift. In this paper, we propose Active Drift Detection with Meta learning (Meta-ADD), a novel framework that learns to classify concept drift by tracking the changed pattern of error rates. Specifically, in the training phase, we extract meta-features based on the error rates of various concept drifts, after which a meta-detector is developed via a prototypical neural network by representing the various concept drift classes as corresponding prototypes. In the detection phase, the learned meta-detector is fine-tuned to adapt to the corresponding data stream via stream-based active learning. Hence, Meta-ADD uses machine learning to learn to detect concept drifts and identify their types automatically, which can directly support drift understanding. The experimental results verify the effectiveness of Meta-ADD.
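To make the idea of detecting drift from error rates concrete, here is a minimal sketch. It is not Meta-ADD (no meta-features, prototypical network, or active learning); it is a much simpler threshold rule in the spirit of the classic Drift Detection Method (DDM), which signals drift when the running error rate rises well above the lowest level seen so far. The class name, parameter values, and the synthetic stream are all assumptions made for illustration.

```python
import math


class ErrorRateDriftDetector:
    """DDM-style sketch (hypothetical class, not from the paper)."""

    def __init__(self, min_samples=30, drift_sigmas=3.0):
        self.min_samples = min_samples    # warm-up before any test is made
        self.drift_sigmas = drift_sigmas  # how far above the best level counts as drift
        self.reset()

    def reset(self):
        self.n = 0
        self.errors = 0
        self.best_p = float("inf")
        self.best_s = float("inf")

    def update(self, is_error):
        """Feed one prediction outcome; return True if drift is signalled."""
        self.n += 1
        self.errors += int(is_error)
        if self.n < self.min_samples:
            return False
        p = self.errors / self.n                # running error rate
        s = math.sqrt(p * (1.0 - p) / self.n)   # its binomial standard deviation
        if p + s < self.best_p + self.best_s:   # remember the best level so far
            self.best_p, self.best_s = p, s
        if p + s > self.best_p + self.drift_sigmas * self.best_s:
            self.reset()                        # start over on the new concept
            return True
        return False


# Synthetic stream (assumed): 10% errors for 500 steps, then 50% errors.
stream = [(i % 10 == 0) if i <= 500 else (i % 2 == 0) for i in range(1, 1001)]

detector = ErrorRateDriftDetector()
drift_points = [i for i, err in enumerate(stream, start=1) if detector.update(err)]
print("drift signalled at steps:", drift_points)
```

A detector like this only reports that the error rate has changed; identifying the type of drift, as the abstract describes, requires additional modelling on top of the error-rate signal.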

Artificial Intelligence