The K-Means algorithm classifies objects into a predetermined number of clusters, K. The cluster centers in this algorithm are chosen randomly, and these centers should preferably be as far from one another as possible. The random starting point affects the effectiveness of the clustering process and its results, and the convergence of the clustering depends mainly on the values of the initial centers.
In this research we focus on the method of selecting the cluster center in order to improve clustering performance in the K-Means algorithm. We use initial cluster centers obtained from data partitioned along the data axis with the highest variance to assign the best cluster center.
The K-Means algorithm classifies objects into a predefined number of clusters, k, which is supplied by the user. It starts by choosing one random center for each cluster, and these centers should preferably lie as far from each other as possible. Because the starting points affect both the clustering process and its results, centroid initialization plays an important role in assigning points to clusters effectively, and the convergence behavior of the clustering depends on the initial centroid values. This research focuses on the selection of cluster centroids in order to improve the clustering performance of the K-Means algorithm. It assigns the initial cluster centroids using centers derived from partitioning the data along the data axis with the highest variance.
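The initialization idea described above can be illustrated with a minimal sketch. It assumes a simplified partitioning rule that the abstract does not specify: the points are sorted along the highest-variance axis and cut into k equal-count slices, and the mean of each slice serves as an initial centroid. The function names (`variance_axis_init`, `kmeans`, `mean_point`, `sq_dist`) and the sample data are hypothetical and chosen only for illustration; the paper's actual partitioning procedure may differ.

```python
from statistics import pvariance


def mean_point(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


def variance_axis_init(points, k):
    """Initial centroids from k equal-count slices along the highest-variance axis.

    Assumption: equal-count slicing of the sorted data is a simplification,
    not necessarily the partitioning rule used in the paper.
    """
    dims = len(points[0])
    # Pick the axis whose coordinates have the largest population variance.
    axis = max(range(dims), key=lambda d: pvariance([p[d] for p in points]))
    ordered = sorted(points, key=lambda p: p[axis])
    n = len(ordered)
    # Boundaries i*n//k yield k non-empty contiguous slices when n >= k.
    return [mean_point(ordered[i * n // k:(i + 1) * n // k]) for i in range(k)]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iter=100):
    """Standard K-Means iterations started from variance_axis_init."""
    centroids = variance_axis_init(points, k)
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                  for p in points]
        # Update step: recompute each centroid as the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            # Keep the old centroid if a cluster loses all its points.
            new_centroids.append(mean_point(members) if members else centroids[c])
        if new_centroids == centroids:
            break  # centroids stopped moving, so the run has converged
        centroids = new_centroids
    return centroids, labels


# Hypothetical sample: two well-separated groups in the plane.
data = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.0), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, labels = kmeans(data, k=2)
```

Because the initial centroids are computed deterministically from the data, repeated runs on the same input give the same clustering, unlike random initialization.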
This paper introduces a new algorithm that solves some of the problems
from which data clustering algorithms such as K-Means suffer.
The new algorithm can cluster data on its own, without the
need for other clustering algorithms.
With the tremendous development in all scientific, economic, political,
and other fields, a need has emerged for nontraditional ways of dealing with all kinds of data (text, video, audio, etc.), which are growing to very large volumes.
In this paper, we introduce a modification to the fuzzy mountain
data clustering algorithm. We made the algorithm work
automatically by finding a way to divide the space and to
determine the values of the input parameters and the stop
condition automatically, instead of obtaining them from the
user.