An Application of Bayesian Classification to Interval Encoded Temporal Mining with Prioritized Items

Published by: R Doomun

Publication date: 2009

Research field: Informatics Engineering

Paper language: English

Authors: C. Balasubramanian, K. Duraiswamy

Databases, Machine Learning

Abstract

In real life, media information has time attributes, whether implicit or explicit; such data is known as temporal data. This paper investigates the usefulness of applying Bayesian classification to an interval-encoded temporal database with prioritized items. The proposed method performs temporal mining by encoding the database with weighted items, which prioritizes the items according to their importance from the user's perspective. Naive Bayesian classification helps make the resulting temporal rules more effective. The proposed priority-based temporal mining (PBTM) method, combined with classification, aids in solving problems in a well-informed and systematic manner. The experimental results are obtained from the complaints database of a telecommunications system and show the feasibility of this method of classification-based temporal mining.
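As a rough illustration of the classification step mentioned in the abstract, the following is a minimal sketch of a categorical naive Bayes classifier with Laplace smoothing. It is not the authors' PBTM method: the abstract does not specify the interval encoding or the item weights, so the record layout (a time-interval code plus a complaint item), the class labels, and the function names `train_naive_bayes` and `predict` are all assumptions made for this example, and item priorities are left out.

```python
from collections import Counter, defaultdict
import math


def train_naive_bayes(records, labels):
    """Count class frequencies and per-class feature-value frequencies."""
    class_counts = Counter(labels)
    # feature_counts[label][feature_index][value] -> count
    feature_counts = defaultdict(lambda: defaultdict(Counter))
    values = defaultdict(set)  # distinct values seen per feature index
    for record, label in zip(records, labels):
        for i, value in enumerate(record):
            feature_counts[label][i][value] += 1
            values[i].add(value)
    return class_counts, feature_counts, values


def predict(model, record):
    """Return the class with the highest Laplace-smoothed log posterior."""
    class_counts, feature_counts, values = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)  # log prior
        for i, value in enumerate(record):
            numerator = feature_counts[label][i][value] + 1
            denominator = count + len(values[i])
            score += math.log(numerator / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical complaint records: (time-interval code, complaint item).
records = [
    ("morning", "no_dial_tone"),
    ("morning", "no_dial_tone"),
    ("evening", "billing"),
    ("evening", "billing"),
    ("morning", "billing"),
]
labels = ["urgent", "urgent", "routine", "routine", "routine"]

model = train_naive_bayes(records, labels)
print(predict(model, ("morning", "no_dial_tone")))  # urgent
```

Working in log space avoids numerical underflow when many features are multiplied together, and the add-one smoothing keeps an unseen feature value from forcing a class probability to zero.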

Rebecca C. Steorts, Anshumali Shrivastava, 2018

Entity resolution seeks to merge databases so as to remove duplicate entries where unique identifiers are typically unknown. We review modern blocking approaches for entity resolution, focusing on those based upon locality sensitive hashing (LSH). First, we introduce $k$-means locality sensitive hashing (KLSH), which is based upon the information retrieval literature and clusters similar records into blocks using a vector-space representation and projections. Second, we introduce a subquadratic variant of LSH to the literature, known as Densified One Permutation Hashing (DOPH). Third, we propose a weighted variant of DOPH. We illustrate each method on an application to a subset of data from the ongoing Syrian conflict, giving a discussion of each method.
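To make the idea of LSH-based blocking concrete, here is a minimal sketch using plain MinHash with banding. This is a simpler, standard technique swapped in for illustration; it is not KLSH or DOPH as described in the abstract. The record strings, the function names, and the parameters `k`, `num_hashes`, and `bands` are assumptions chosen for the example.

```python
import hashlib
from collections import defaultdict


def shingles(text, k=3):
    """Return the set of character k-grams of a lowercased string."""
    text = text.lower()
    return {text[i:i + k] for i in range(max(1, len(text) - k + 1))}


def minhash_signature(items, num_hashes=8):
    """Return one minimum hash value per seeded hash function."""
    signature = []
    for seed in range(num_hashes):
        signature.append(min(
            int.from_bytes(
                hashlib.md5(f"{seed}:{item}".encode()).digest()[:8], "big"
            )
            for item in items
        ))
    return tuple(signature)


def lsh_blocks(records, num_hashes=8, bands=4):
    """Group record ids whose signatures agree on at least one band."""
    rows = num_hashes // bands
    blocks = defaultdict(set)
    for record_id, text in records.items():
        sig = minhash_signature(shingles(text), num_hashes)
        for b in range(bands):
            key = (b, sig[b * rows:(b + 1) * rows])
            blocks[key].add(record_id)
    return blocks


# Made-up records; "a" and "b" differ only in letter case.
records = {
    "a": "John Smith 1980",
    "b": "john smith 1980",
    "c": "Maria Haddad 1975",
}
blocks = lsh_blocks(records)
candidate_blocks = [ids for ids in blocks.values() if len(ids) > 1]
print(candidate_blocks)
```

Only records that share a block are compared pairwise afterwards, which is what lets blocking avoid the quadratic cost of comparing every record with every other record.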

Databases, Machine Learning, Statistics Applications

M. S. Danessh, C. Balasubramanian, K. Duraiswamy, 2010

Data mining has been widely recognized as a powerful tool for extracting added value from large-scale databases. Finding frequent item sets in databases is a crucial step in the data mining process of extracting association rules. Many algorithms have been developed to find frequent item sets. This paper presents a summary and a comparative study of the available FP-growth algorithm variations for mining frequent item sets, showing their capabilities and efficiency in terms of time and memory consumption on association rule mining, taking application-specific information into account. It proposes a pattern-growth mining paradigm based on the FP-tree growth algorithm, which employs a tree structure to compress the database. The performance study shows that the anti-FP-growth method is efficient and scalable for mining both long and short frequent patterns, is about an order of magnitude faster than the Apriori algorithm, and is also faster than some recently reported frequent-pattern mining methods.
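For readers unfamiliar with the task, the sketch below shows what "finding frequent item sets" means by brute-force counting. It is deliberately not FP-growth or Apriori: it enumerates every item combination up to a size limit, which is exactly the cost those algorithms are designed to avoid. The transactions, the function name `frequent_itemsets`, and the `min_support` and `max_size` parameters are assumptions for this example.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=3):
    """Count every itemset up to max_size and keep those meeting min_support."""
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))  # canonical order, no duplicates
        for size in range(1, max_size + 1):
            for itemset in combinations(items, size):
                counts[itemset] += 1
    return {itemset: n for itemset, n in counts.items() if n >= min_support}


# Made-up market-basket transactions.
transactions = [
    ["bread", "milk"],
    ["bread", "butter"],
    ["bread", "milk", "butter"],
    ["milk"],
]
result = frequent_itemsets(transactions, min_support=2)
for itemset, support in sorted(result.items()):
    print(itemset, support)
```

With a minimum support of 2, the pair `("bread", "milk")` is kept because it appears in two transactions, while `("butter", "milk")` appears only once and is dropped.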

Databases

A Benchmark to Select Data Mining Based Classification Algorithms For Business Intelligence And Decision Support Systems

Pardeep Kumar, Nitin, Vivek Kumar Sehgal, 2012

Decision support systems (DSS) serve the management, operations, and planning levels of an organization and help to make decisions, which may be rapidly changing and not easily specified in advance. Data mining plays a vital role in extracting important information to help in the decision making of a decision support system. Integration of data mining and DSS can lead to improved performance and can enable the tackling of new types of problems. Artificial intelligence methods are improving the quality of decision support and have become embedded in many applications, ranging from anti-lock automobile brakes to today's interactive search engines. They provide various machine learning techniques to support data mining. Classification is one of the main and most valuable tasks of data mining. Several types of classification algorithms have been suggested, tested, and compared to determine future trends based on unseen data. No single algorithm has been found to be superior to all others for all data sets. The objective of this paper is to compare various classification algorithms that have been frequently used in data mining for decision support systems. Three decision-tree-based algorithms, one artificial neural network, one statistical algorithm, one support vector machine (SVM) with and without AdaBoost, and one clustering algorithm are tested and compared on four data sets from different domains in terms of predictive accuracy, error rate, classification index, comprehensibility, and training time. Experimental results demonstrate that Genetic Algorithm (GA) and SVM-based algorithms are better in terms of predictive accuracy. SVM without AdaBoost should be the first choice in terms of speed and predictive accuracy. AdaBoost improves the accuracy of SVM, but at the cost of a long training time.

Databases, Machine Learning

Analyzing hierarchical multi-view MRI data with StaPLR: An application to Alzheimer's disease classification

Wouter van Loon, Frank de Vos, Marjolein Fokkema, 2021

Multi-view data refers to a setting where features are divided into feature sets, for example because they correspond to different sources. Stacked penalized logistic regression (StaPLR) is a recently introduced method that can be used for classification and for automatically selecting the views that are most important for prediction. We show how this method can easily be extended to a setting where the data has a hierarchical multi-view structure. We apply StaPLR to Alzheimer's disease classification, where different MRI measures have been calculated from three scan types: structural MRI, diffusion-weighted MRI, and resting-state fMRI. StaPLR can identify which scan types and which MRI measures are most important for classification, and it outperforms elastic net regression in classification performance.

Methodology, Machine Learning, Statistics Applications

Mining Rules Incrementally over Large Knowledge Bases

Xiaofeng Zhou, Ali Sadeghian, Daisy Zhe Wang, 2019

Multiple web-scale Knowledge Bases, e.g., Freebase, YAGO, and NELL, have been constructed using semi-supervised or unsupervised information extraction techniques, and many of them, despite their large sizes, are continuously growing. Much research effort has been put into mining inference rules from knowledge bases. To address the task of rule mining over evolving web-scale knowledge bases, we propose a parallel incremental rule mining framework. Our approach is able to efficiently mine rules based on the relational model and apply updates to large knowledge bases; we propose an alternative metric that reduces computational complexity without compromising quality; we apply multiple optimization techniques that reduce runtime by more than 2 orders of magnitude. Experiments show that our approach efficiently scales to web-scale knowledge bases and saves over 90% of the time compared to the state-of-the-art batch rule mining system. We also apply our optimization techniques to the batch rule mining algorithm, reducing runtime by more than half compared to the state-of-the-art. To the best of our knowledge, our incremental rule mining system is the first that handles updates to web-scale knowledge bases.
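As a small illustration of what scoring an inference rule over a knowledge base involves, the sketch below computes the standard support and confidence of a single rule of the form body(x, y) => head(x, y) over subject-relation-object triples. It does not reproduce the alternative metric, the incremental updates, or the parallel framework mentioned in the abstract; the triples, relation names, and the function name `rule_stats` are assumptions for this example.

```python
def rule_stats(triples, body_rel, head_rel):
    """Return (support, confidence) for the rule body_rel(x, y) => head_rel(x, y)."""
    body = {(s, o) for s, r, o in triples if r == body_rel}
    head = {(s, o) for s, r, o in triples if r == head_rel}
    support = len(body & head)  # pairs where both body and head hold
    confidence = support / len(body) if body else 0.0
    return support, confidence


# Made-up knowledge base triples: (subject, relation, object).
triples = [
    ("anna", "born_in", "paris"),
    ("anna", "lives_in", "paris"),
    ("ben", "born_in", "rome"),
    ("ben", "lives_in", "oslo"),
    ("cara", "born_in", "lima"),
    ("cara", "lives_in", "lima"),
]
support, confidence = rule_stats(triples, "born_in", "lives_in")
print(support, round(confidence, 3))
```

Here the rule born_in(x, y) => lives_in(x, y) holds for two of the three pairs that match its body, so its support is 2 and its confidence is two thirds.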

Databases, Machine Learning