
With the tremendous development in all scientific, economic, political and other fields, the need has emerged for non-traditional ways of dealing with data of all kinds (text, video, audio, etc.), whose volumes have become very large. New methods were needed to extract the knowledge and information hidden within this huge amount of data, such as queries for customers who share the same purchasing habits, forecasts for the sale of a particular commodity in a given geographical area, and other deductive queries based on data mining technology. Among the most important data mining methods is clustering, for which several algorithms exist. This research focuses on a computed method for creating the initial centers of the K-Medoids algorithm, which divides the data into clusters, each containing a subset of the database that is easy to handle. Computing the initial centers, rather than selecting them at random, avoids the inconsistent results and slow execution that random selection causes.
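The paper does not spell out its computed-center formula, but the idea can be sketched with a common deterministic heuristic: start from the most central point and then repeatedly pick the point farthest from the medoids chosen so far. The function names and the heuristic itself are illustrative assumptions, not the paper's exact method.

```python
import numpy as np

def initial_medoids(X, k):
    """Deterministically pick k initial medoids: most central point first,
    then a farthest-point heuristic (a stand-in for the paper's computed
    centers; random initialization would make results vary between runs)."""
    d = np.linalg.norm(X[:, None] - X[None, :], axis=2)  # pairwise distances
    medoids = [int(np.argmin(d.sum(axis=1)))]            # most central point
    while len(medoids) < k:
        # next medoid: the point farthest from every chosen medoid
        medoids.append(int(np.argmax(d[:, medoids].min(axis=1))))
    return medoids

def k_medoids(X, k, iters=100):
    """Plain K-Medoids: assign points to the nearest medoid, then move each
    medoid to the cluster member minimizing within-cluster distance."""
    medoids = initial_medoids(X, k)
    for _ in range(iters):
        labels = np.argmin(
            np.linalg.norm(X[:, None] - X[medoids][None, :], axis=2), axis=1)
        new = []
        for c in range(k):
            idx = np.where(labels == c)[0]
            within = np.linalg.norm(X[idx][:, None] - X[idx][None, :], axis=2)
            new.append(int(idx[np.argmin(within.sum(axis=1))]))
        if new == medoids:   # converged: medoids no longer move
            break
        medoids = new
    return medoids, labels
```

Because the initialization is deterministic, repeated runs on the same data give identical clusters, which is exactly the instability that random seeding introduces.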
The algorithm classifies objects into a predefined number of clusters given by the user (assume k clusters). The idea is to choose cluster centers, one for each cluster, preferably as far from each other as possible. The starting points affect both the clustering process and its results, so centroid initialization plays an important role in determining cluster assignments effectively; the convergence behavior of the clustering also depends on the initial centroid values. This research focuses on the selection of initial cluster centroids to improve the performance of the K-Means clustering algorithm, deriving the initial centers by partitioning the data along the axis with the highest variance.
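The partitioning idea can be sketched briefly: find the dimension with the highest variance, sort the points along it, split them into k contiguous partitions, and take each partition's mean as an initial centroid. This is a minimal reading of the cited method; the original may differ in detail.

```python
import numpy as np

def variance_partition_centers(X, k):
    """Derive k initial K-Means centers by partitioning the data along the
    axis (dimension) with the highest variance; a sketch of the cited idea."""
    axis = int(np.argmax(X.var(axis=0)))   # dimension with highest variance
    order = np.argsort(X[:, axis])         # sort points along that axis
    parts = np.array_split(order, k)       # k contiguous partitions
    return np.array([X[p].mean(axis=0) for p in parts])
```

Unlike random seeding, these centers are spread along the direction in which the data varies most, so K-Means starts closer to well-separated regions and converges more consistently.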
The ability of data mining to provide predictive information derived from huge databases has made it an effective tool in the hands of companies and individuals, allowing them to focus on the areas that matter to them within the massive data generated by their daily activities. Along with the increasing importance of this science, the number of tools implementing its theoretical concepts has grown rapidly, which makes it hard to decide which tool is best suited to a given task. This study compares the two data mining tools most commonly used according to opinion polls, namely RapidMiner and the R programming language, to help researchers and developers choose between them. The comparison is based on seven criteria: platform, algorithms, input/output formats, visualization, user evaluation, infrastructure and development potential, and performance, measured by applying a set of classification algorithms to a number of data sets and using two data-splitting techniques, cross-validation and hold-out, to verify the results. The results show that R supports the largest number of algorithms, input/output formats, and visualizations, while RapidMiner is superior in ease of use and supports a greater number of platforms. In terms of performance, the classification models built with R packages were generally more accurate, with some exceptions imposed by the nature of the data, since no pre-processing stage was added. Ultimately, the preferred tool depends on the user's experience and the purpose for which the tool is used.
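The two splitting techniques used in the comparison differ in how they spend the data: hold-out tests on a single random split, while k-fold cross-validation tests every point exactly once. A minimal sketch of the two index-splitting schemes (function names are illustrative, not from either tool):

```python
import numpy as np

def holdout_split(n, test_frac=0.3, seed=0):
    """Hold-out: one random train/test split of n sample indices."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n)
    cut = int(n * (1 - test_frac))
    return idx[:cut], idx[cut:]

def kfold_splits(n, k=10):
    """k-fold cross-validation: yield k (train, test) index pairs so that
    each sample appears in a test fold exactly once."""
    folds = np.array_split(np.arange(n), k)
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train, test
```

Averaging the accuracy over all k folds gives a more stable estimate than a single hold-out split, which is why the study uses both to confirm its results.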
Traditional methods of analyzing massive data sets are not conducive to discovering new knowledge patterns that support decision-making. The purpose of this article is therefore to design a visual analysis system that supports the analysis of data sets through automated analysis, including techniques such as clustering, classification, and association rules, together with visual data exploration techniques. The proposed visualization system is then compared with other data-set visualization techniques and evaluated.
As a result of the tremendous development in science and technology and the wide spread of the Internet, human knowledge has become within everyone's reach. With this huge amount of information, however, the reader is scattered among many sources and lost in this vast space. This information explosion required means of controlling it: organizing the information, arranging it under broad headings, and tracking it. The technical community therefore moved toward a new field called Topic Detection and Tracking. This concept is widely applied to social networks, news, scientific articles, and much more. In the news domain, thousands of news agencies often broadcast tens of thousands of stories about the same event, which led news portals, foremost among them Google News, to apply topic detection and tracking systems. Such a system covers a set of tasks defined by DARPA. The first monitors a stream of connected textual stories to find the boundaries between one story and the next, and is called story segmentation. The second answers the question: do two given stories discuss the same topic or event? It is called link detection. The third monitors a stream of stories to detect those discussing a topic defined by the user, and is called topic tracking. The fourth identifies stories that discuss new events as soon as they arrive, and is called first story detection. The last, called topic detection, is responsible for separating a set of mixed stories into topics without any prior knowledge of those topics, i.e. grouping stories that discuss the same topic into the same cluster. In this project we implement and evaluate the last four tasks. Stories are received in real time and pre-processed (linguistically and otherwise); each story is then represented as a vector, its words are weighted, and a set of words is selected to represent it. For topic representation we test different forms, such as vector representation or representation by stories. We also discuss using different measures to represent stories and compute their similarity, and we test using the story's title and date as features in addition to the word set.
We also present our own approach to normalizing the similarities between stories and mitigating the effect of threshold selection on the system, and we show the remarkable improvements this approach yields: it makes it possible to build a topic detection and tracking system without worrying about threshold selection at all, which has always been the biggest challenge for this kind of system. We describe our application of state-of-the-art clustering algorithms to the topic detection task, and show how we modified the affinity matrix of the spectral clustering algorithm and used a different normalization method adapted to our system, which improved clustering performance from 0.89 to 0.97 as measured by F-measure.
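The story representation and similarity measurement described above can be sketched with the standard TF-IDF weighting and cosine similarity; the project's exact weighting scheme and feature selection may differ, so treat this as an illustrative baseline for the link detection task.

```python
import math
from collections import Counter

def tfidf_vectors(stories):
    """Represent each story as a sparse word-weight vector: term frequency
    times inverse document frequency (the standard weighting scheme)."""
    n = len(stories)
    df = Counter(w for s in stories for w in set(s.split()))
    vecs = []
    for s in stories:
        tf = Counter(s.split())
        vecs.append({w: tf[w] * math.log(n / df[w]) for w in tf})
    return vecs

def cosine(a, b):
    """Cosine similarity between two sparse word-weight vectors; link
    detection asks whether this exceeds a (normalized) threshold."""
    dot = sum(a[w] * b.get(w, 0.0) for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0
```

Two stories about the same event share many high-weight terms and score near 1, while unrelated stories score near 0; the thesis's normalization approach aims to make the decision threshold between those two regimes stable across topics.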
In this paper we present a comparison of several data mining algorithms for traffic accident analysis. We start by describing the available input data through an analysis of the structure of the statistical reports in the Lattakia traffic directorate, then proceed to the data mining stage, which enables an intelligent study of the factors that play a role in traffic accidents, their inter-relations, and their importance in causing accidents. This follows the construction of a data warehouse on top of the database we built to store the gathered data. We list some of the models that were tested, a sample of the many cases we examined to obtain the research results.
Tracking using wireless sensor networks is one of the applications experiencing significant growth. Because wireless sensor nodes have a limited energy source, research continues to improve routing and information-forwarding methods to reduce power consumption. In this research we improve the routing of target-location information within a WSN by proposing a new algorithm that takes advantage of clustering for wireless sensor networks, while adding the possibility of interaction between field sensors belonging to different clusters, which cannot interact with each other in traditional cluster-based networks. To avoid transferring the same information repeatedly, we rely on the intensity of the signal received from the target at each sensor, which reflects positively on the network lifetime and gives a more accurate indication of the target's location. We implemented the proposed algorithm and present the results using the OPNET simulator, one of the best simulators for various types of networks.
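The received-signal-strength rule described above can be sketched in two small pieces: electing the single strongest-RSSI sensor to report (suppressing duplicate transmissions from neighboring clusters), and estimating distance from RSSI via the log-distance path-loss model. All parameter values and function names here are illustrative assumptions, not taken from the paper.

```python
import math

def elect_reporter(readings):
    """Only the sensor with the strongest received signal forwards the
    target report; sensors in other clusters that also heard the target
    suppress their duplicates. readings maps sensor id -> RSSI in dBm."""
    return max(readings, key=readings.get)

def rssi_to_distance(rssi_dbm, ref_power_dbm=-30.0, path_loss_exp=2.0):
    """Estimate target distance (in the model's reference units) from RSSI
    using the log-distance path-loss model; ref_power_dbm is the assumed
    received power at unit distance, path_loss_exp the environment exponent."""
    return 10 ** ((ref_power_dbm - rssi_dbm) / (10 * path_loss_exp))
```

Because only one sensor transmits per detection, the network sends fewer redundant packets (extending its lifetime), and the strongest RSSI also identifies the sensor closest to the target.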
The main goal of the data mining process is to extract information and discover knowledge from huge databases, and clustering is one of the most important functionalities in this area. There are many clustering algorithms and methods, but determining or estimating the number of clusters to extract from a dataset is one of the most important issues most of these methods encounter. This research focuses on the problem of estimating the number of clusters in the case of agglomerative hierarchical clustering. We present an evaluation of three of the most common methods used to estimate the number of clusters.
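The abstract does not name the three methods it evaluates, but one common representative can be sketched: run agglomerative (here single-linkage) clustering, record the distance of each merge, and cut the dendrogram just before the largest jump in merge distance. This largest-gap heuristic is an illustrative example, not necessarily one of the paper's three methods.

```python
import numpy as np

def single_linkage_merge_distances(X):
    """Naive single-linkage agglomerative clustering over n points,
    returning the n-1 merge distances in the order the merges happen."""
    clusters = [[i] for i in range(len(X))]
    D = np.linalg.norm(X[:, None] - X[None, :], axis=2)
    merges = []
    while len(clusters) > 1:
        best = (np.inf, None, None)
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(D[a, b] for a in clusters[i] for b in clusters[j])
                if d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append(d)
        clusters[i] += clusters.pop(j)   # merge the two closest clusters
    return merges

def estimate_k(merge_distances):
    """Largest-gap heuristic: cut the dendrogram just before the biggest
    jump in merge distance; k = merges remaining above the cut."""
    gaps = np.diff(merge_distances)
    return len(merge_distances) - int(np.argmax(gaps))
```

For well-separated clusters the within-cluster merges happen at small distances and the between-cluster merges at much larger ones, so the largest gap marks a natural cut level.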
This paper introduces a new algorithm that solves some of the problems that data clustering algorithms such as K-Means suffer from. The new algorithm is able to cluster data by itself, without the need for other clustering algorithms.
This research presents a literature review on the use of artificial intelligence and data mining techniques in anti-money-laundering systems. We compare the methodologies used in different research papers with the purpose of shedding some light on real-life applications of artificial intelligence.