The ability of data mining to derive predictive information from huge databases has made it an effective tool for companies and individuals, allowing them to focus on the areas that matter to them within the massive volume of data generated by their daily activities. As the field has grown in importance, the number of tools built to put its theoretical concepts into practice has grown rapidly, which makes it hard to decide which tool is best for a given task. This study compares the two data mining tools most commonly used according to opinion polls, RapidMiner and the R programming language, to help researchers and developers choose the one better suited to them. The comparison rests on seven criteria: platform, algorithms, input/output formats, visualization, user evaluation, infrastructure and development potential, and performance. Performance was measured by applying a set of classification algorithms to a number of data sets, using two techniques to split each data set, cross-validation and hold-out, to confirm the results. The results show that R supports the larger number of algorithms, input/output formats, and visualizations, while RapidMiner is superior in ease of use and supports more platforms. In terms of performance, the classification models built with R packages were more accurate, except in some cases dictated by the nature of the data, because no pre-processing stage was applied. Finally, the preferred tool depends on the user's level of experience and the purpose for which the tool is used.
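The two splitting techniques named in this abstract, hold-out and cross-validation, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy two-class data, the 1-nearest-neighbor classifier, the 30% test fraction, and the five folds are not taken from the study, which used RapidMiner and R.

```python
import random

def holdout_split(n, test_fraction=0.3, seed=0):
    """Shuffle indices once and reserve a fixed fraction for testing."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = int(round(n * test_fraction))
    return idx[n_test:], idx[:n_test]  # (train, test)

def kfold_splits(n, k=5, seed=0):
    """Yield k (train, test) pairs; each index is tested exactly once."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f in range(k) if f != i for j in folds[f]]
        yield train, folds[i]

def predict_1nn(train_X, train_y, x):
    """Hypothetical stand-in classifier: label of the closest training point."""
    best = min(range(len(train_X)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)))
    return train_y[best]

def accuracy(X, y, train_idx, test_idx):
    tX = [X[i] for i in train_idx]
    ty = [y[i] for i in train_idx]
    correct = sum(predict_1nn(tX, ty, X[i]) == y[i] for i in test_idx)
    return correct / len(test_idx)

# Toy data (assumed): two Gaussian blobs, one per class.
rng = random.Random(42)
X = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(50)] + \
    [(rng.gauss(4, 1), rng.gauss(4, 1)) for _ in range(50)]
y = [0] * 50 + [1] * 50

train_idx, test_idx = holdout_split(len(X))
holdout_acc = accuracy(X, y, train_idx, test_idx)

cv_scores = [accuracy(X, y, tr, te) for tr, te in kfold_splits(len(X), k=5)]
cv_acc = sum(cv_scores) / len(cv_scores)

print(f"hold-out accuracy: {holdout_acc:.2f}")
print(f"5-fold CV accuracy: {cv_acc:.2f}")
```

Hold-out gives a single estimate that depends on one random split, while cross-validation averages over several splits, which is presumably why using both helps confirm a result.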
In this paper we present a comparison of several data mining algorithms for traffic accident analysis. We start by describing the data available for entry, based on an analysis of the structure of the statistical reports of the Lattakia traffic directorate, and then proceed to the data mining stage, which enables an intelligent study of the factors that play a role in traffic accidents, their interrelations, and their importance in causing accidents. This stage follows the construction of a data warehouse on top of the database we built to store the gathered data. We list some of the models that were tested, a sample of the many cases we examined to obtain the research results.
Association rules are an important field of data mining, used to discover useful knowledge in massive databases. Association rules have been used to extract information from database transactions, and the Apriori algorithm is a practical application of association rules that finds frequent itemsets in database transactions. In this paper, we present a new improvement to the Apriori algorithm that reduces the generation of candidate itemsets, which improves the algorithm's efficiency.
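For orientation, the classic Apriori procedure that this abstract builds on can be sketched in a few lines of Python. This is the textbook join-and-prune version, not the improved variant the paper proposes (the abstract does not describe it); the sample transactions and the `min_support` value are assumptions for illustration.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: support count} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]

    # Pass 1: count single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(frequent)

    k = 2
    while frequent:
        prev = set(frequent)
        # Join step: combine frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: drop candidates with any infrequent (k-1)-subset.
        candidates = {c for c in candidates
                      if all(frozenset(s) in prev for s in combinations(c, k - 1))}
        # Count support with one scan over the transactions.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        frequent = {s: c for s, c in counts.items() if c >= min_support}
        result.update(frequent)
        k += 1
    return result

# Hypothetical market-basket data.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]
frequent_itemsets = apriori(transactions, min_support=3)
for itemset, count in sorted(frequent_itemsets.items(),
                             key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

The prune step is where candidate generation is already reduced in the classic algorithm: a candidate is discarded before any database scan if one of its subsets is infrequent. Work on reducing candidates further typically targets the join step or the counting scan.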
In this research, we present a new and simple method for handwritten character recognition. The method extracts the positions of the black points in binary (black and white) images according to certain coordinates, which are used in the training and testing stages. The extracted positions are stored in a database in a structure suited to predictive data mining. We used the training data to build a predictive model that helps recognize the testing data, based on the data stored in the database. We conducted a number of tests on different samples of handwritten character images and obtained accurate results under the required conditions.
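The idea of describing a character by the positions of its black points can be sketched as follows. The abstract does not specify the coordinate scheme, the database structure, or the predictive model, so the sketch assumes that images are lists of rows with 1 for black and 0 for white, that positions are (row, column) pairs, and that recognition picks the label whose stored sample has the highest Jaccard overlap; all three choices are hypothetical stand-ins.

```python
def black_point_positions(image):
    """Return the set of (row, col) coordinates where the pixel is black (1)."""
    return {(r, c)
            for r, row in enumerate(image)
            for c, value in enumerate(row)
            if value == 1}

def jaccard(a, b):
    """Overlap between two position sets: |a & b| / |a | b|."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def classify(image, training):
    """Pick the label whose best stored sample overlaps the image most."""
    points = black_point_positions(image)
    return max(training,
               key=lambda label: max(jaccard(points, s) for s in training[label]))

# Tiny 3x3 glyphs standing in for handwritten samples (assumed data).
glyph_I = [[0, 1, 0],
           [0, 1, 0],
           [0, 1, 0]]
glyph_L = [[1, 0, 0],
           [1, 0, 0],
           [1, 1, 1]]

# "Database" of extracted positions, keyed by character label.
training = {
    "I": [black_point_positions(glyph_I)],
    "L": [black_point_positions(glyph_L)],
}

# A slightly noisy "I" with one extra black pixel.
test_image = [[0, 1, 0],
              [0, 1, 0],
              [0, 1, 1]]
predicted = classify(test_image, training)
print(predicted)
```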
The main goal of the data mining process is to extract information and discover knowledge from huge databases, and clustering is one of the most important functions in this area. There are many clustering algorithms and methods, but determining or estimating the number of clusters that should be extracted from a dataset is one of the most important problems most of these methods face. This research focuses on the problem of estimating the number of clusters in agglomerative hierarchical clustering. We present an evaluation of three of the most common methods for estimating the number of clusters.
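To make the problem concrete, here is a minimal Python sketch of one simple heuristic for choosing the number of clusters in agglomerative clustering: cut the hierarchy where the distance between consecutive merges jumps the most. The abstract does not name the three methods it evaluates, so this heuristic, the single-linkage criterion, and the toy points are all assumptions for illustration and may not match any of them.

```python
import math

def single_linkage_merge_distances(points):
    """Agglomerate bottom-up and record the distance of each merge."""
    clusters = [[p] for p in points]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the closest pair of members.
                d = min(math.dist(a, b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append(d)
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return merges

def estimate_k_by_largest_gap(merges):
    """Stop merging just before the largest jump in merge distance.

    Needs at least two merges, i.e. at least three points.
    """
    n = len(merges) + 1
    gaps = [merges[m + 1] - merges[m] for m in range(len(merges) - 1)]
    m = max(range(len(gaps)), key=gaps.__getitem__)
    # After m + 1 merges, n - (m + 1) clusters remain.
    return n - (m + 1)

# Toy data (assumed): three well-separated groups.
points = [(0, 0), (0, 1), (1, 0),
          (10, 10), (10, 11), (11, 10),
          (20, 0), (21, 0)]
merges = single_linkage_merge_distances(points)
estimated_k = estimate_k_by_largest_gap(merges)
print(estimated_k)
```

The pairwise search is cubic or worse in the number of points, which is acceptable for a sketch but not for large datasets.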