A-Ward$_{p\beta}$: Effective hierarchical clustering using the Minkowski metric and a fast $k$-means initialisation

Published by Renato Cordeiro de Amorim

Publication date: 2016

Research field: Informatics Engineering, Mathematical Statistics

Paper language: English

Authors: Renato Cordeiro de Amorim, Vladimir Makarenkov, Boris Mirkin

Machine learning

Abstract

In this paper we make two novel contributions to hierarchical clustering. First, we introduce an anomalous pattern initialisation method for hierarchical clustering algorithms, called A-Ward, capable of substantially reducing the time they take to converge. This method generates an initial partition with a sufficiently large number of clusters, which allows the cluster merging process to start from this partition rather than from a trivial partition composed solely of singletons. Our second contribution is an extension of the Ward and Ward$_p$ algorithms to the situation where the feature weight exponent can differ from the exponent of the Minkowski distance. This new method, called A-Ward$_{p\beta}$, is able to generate a much wider variety of clustering solutions. We also demonstrate that its parameters can be estimated reasonably well by using a cluster validity index. We perform numerous experiments using data sets with two types of noise: insertion of noise features and blurring of within-cluster values of some features. These experiments allow us to conclude: (i) our anomalous pattern initialisation method does indeed reduce the time a hierarchical clustering algorithm takes to complete, without negatively impacting its cluster recovery ability; (ii) A-Ward$_{p\beta}$ provides better cluster recovery than both Ward and Ward$_p$.
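The idea of decoupling the feature weight exponent from the Minkowski exponent can be illustrated with a minimal sketch. The abstract does not give the exact criterion, so the functions below are assumptions for illustration only: `weighted_minkowski` raises each feature weight to a hypothetical exponent `beta` and each coordinate difference to `p` (no final root is taken), and `ward_merge_cost` applies the classical Ward size factor to that distance. For `p = 2` and unit weights this reduces to the usual Ward merge cost.

```python
def weighted_minkowski(x, y, weights, p, beta):
    """Sum over features of weight**beta * |x_v - y_v|**p.

    Assumption: this is the p-th power of a weighted Minkowski distance,
    with the weight exponent (beta) allowed to differ from p.
    """
    return sum((w ** beta) * abs(a - b) ** p for a, b, w in zip(x, y, weights))


def ward_merge_cost(centroid_a, n_a, centroid_b, n_b, weights, p, beta):
    """Ward-style cost of merging two clusters of sizes n_a and n_b.

    Assumption: the classical Ward factor n_a*n_b/(n_a+n_b) is reused with
    the weighted distance above; the paper's exact form may differ.
    """
    size_factor = (n_a * n_b) / (n_a + n_b)
    return size_factor * weighted_minkowski(centroid_a, centroid_b, weights, p, beta)
```

With `beta == p` the weights and the distance share one exponent; letting them differ gives a second degree of freedom in how strongly low-weight (presumably noisy) features are suppressed.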

79 - Luc Giffon, Valentin Emiya, Liva Ralaivola 2019

K-means, together with the celebrated Lloyd algorithm, is more than the clustering method it was originally designed to be. It has indeed proven pivotal in speeding up many machine learning and data analysis techniques such as indexing, nearest-neighbor search and prediction, and data compression; its beneficial use has been shown to carry over to the acceleration of kernel machines (when using the Nyström method). Here, we propose a fast extension of K-means, dubbed QuicK-means, that rests on the idea of expressing the matrix of the $K$ centroids as a product of sparse matrices, a feat made possible by recent results devoted to finding approximations of matrices as a product of sparse factors. Using such a decomposition squashes the complexity of the matrix-vector product between the factorized $K \times D$ centroid matrix $\mathbf{U}$ and any vector from $\mathcal{O}(K D)$ to $\mathcal{O}(A \log A+B)$, with $A=\min (K, D)$ and $B=\max (K, D)$, where $D$ is the dimension of the training data. This drastic computational saving has a direct impact on the assignment of a point to a cluster, meaning that it is tangible not only at prediction time but also at training time, provided the factorization procedure is performed during Lloyd's algorithm. We show precisely that resorting to a factorization step at each iteration does not impair the convergence of the optimization scheme and that, depending on the context, it may entail a reduction of the training time. Finally, we provide discussions and numerical simulations that show the versatility of our computationally efficient QuicK-means algorithm.
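To see why the matrix-vector product with the centroid matrix dominates the assignment step, note that $\arg\min_k \|x - u_k\|^2 = \arg\min_k \left(\tfrac{1}{2}\|u_k\|^2 - u_k^\top x\right)$, since $\|x\|^2$ does not depend on $k$. The sketch below uses a dense product and hypothetical function names; it does not implement the sparse factorization, it only shows where a faster product with $\mathbf{U}$ would plug in.

```python
def matvec(U, x):
    """Dense product U @ x for a K x D matrix stored as a list of rows: O(K*D)."""
    return [sum(u_d * x_d for u_d, x_d in zip(row, x)) for row in U]


def assign(U, x):
    """Index of the nearest centroid, computed through the product U @ x."""
    scores = matvec(U, x)  # the step a sparse factorization would accelerate
    half_sq_norms = [0.5 * sum(u * u for u in row) for row in U]  # precomputable
    costs = [h - s for h, s in zip(half_sq_norms, scores)]
    return min(range(len(U)), key=costs.__getitem__)


def assign_brute_force(U, x):
    """Reference assignment using explicit squared Euclidean distances."""
    dists = [sum((u - v) ** 2 for u, v in zip(row, x)) for row in U]
    return min(range(len(U)), key=dists.__getitem__)
```

Both functions return the same index; only `matvec` would need to change to benefit from a factorized $\mathbf{U}$.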

Machine learning

A Stochastic Alternating Balance $k$-Means Algorithm for Fair Clustering

143 - Suyun Liu, Luis Nunes Vicente 2021

In the application of data clustering to human-centric decision-making systems, such as loan applications and advertisement recommendations, the clustering outcome might discriminate against people across different demographic groups, leading to unfairness. A natural conflict occurs between the cost of clustering (in terms of distance to cluster centers) and the balanced representation of all demographic groups across the clusters, leading to a bi-objective optimization problem that is nonconvex and nonsmooth. To determine the complete trade-off between these two competing goals, we design a novel stochastic alternating balance fair $k$-means (SAfairKM) algorithm, which consists of alternating classical mini-batch $k$-means updates and group swap updates. The number of $k$-means updates and the number of swap updates essentially parameterize the weight put on optimizing each objective function. Our numerical experiments show that the proposed SAfairKM algorithm is robust and computationally efficient in constructing well-spread and high-quality Pareto fronts on both synthetic and real datasets. Moreover, we propose a novel companion algorithm, the stochastic alternating bi-objective gradient descent (SA2GD) algorithm, which can handle a smooth version of the considered bi-objective fair $k$-means problem that is more amenable to analysis. A sublinear convergence rate of $\mathcal{O}(1/T)$ is established under strong convexity for the determination of a stationary point of a weighted sum of the two functions, parameterized by the number of steps or updates on each function.
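As a rough illustration of the "classical mini-batch $k$-means update" mentioned above, the sketch below follows the widely used scheme with per-center counts and a learning rate of one over the count. The function name and data layout are assumptions; the group swap update is not shown because the abstract does not specify it.

```python
def minibatch_kmeans_step(centers, counts, batch):
    """One mini-batch k-means update, modifying centers and counts in place.

    centers: list of K centers (lists of floats)
    counts:  list of K running counts of points assigned so far
    batch:   list of points (lists of floats)
    """
    def nearest(x):
        dists = [sum((c - v) ** 2 for c, v in zip(center, x)) for center in centers]
        return min(range(len(centers)), key=dists.__getitem__)

    # Cache assignments first so they all use the centers from before the update.
    assignments = [nearest(x) for x in batch]

    for x, k in zip(batch, assignments):
        counts[k] += 1
        eta = 1.0 / counts[k]  # per-center learning rate
        centers[k] = [(1.0 - eta) * c + eta * v for c, v in zip(centers[k], x)]
    return centers, counts
```

Because the learning rate shrinks as a center accumulates points, each center tracks the running mean of the points assigned to it.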

Machine learning

$t$-$k$-means: A Robust and Stable $k$-means Variant

364 - Yiming Li, Yang Zhang, Qingtao Tang 2019

The $k$-means algorithm is one of the most classical clustering methods and has been widely and successfully used in signal processing. However, due to the thin-tailed property of the Gaussian distribution, the $k$-means algorithm performs relatively poorly on datasets containing heavy-tailed data or outliers. Moreover, the standard $k$-means algorithm has relatively weak stability, i.e., its results have a large variance, which reduces its credibility. In this paper, we propose a robust and stable $k$-means variant, dubbed $t$-$k$-means, as well as a fast version of it, to alleviate those problems. Theoretically, we derive $t$-$k$-means and analyze its robustness and stability from the perspective of the loss function and the expression of the clustering center, respectively. Extensive experiments are also conducted, which verify the effectiveness and efficiency of the proposed method. The code for reproducing the main results is available at https://github.com/THUYimingLi/t-k-means.

Machine learning

Deep clustering with concrete k-means

125 - Boyan Gao, Yongxin Yang, Henry Gouk 2019

We address the problem of simultaneously learning a k-means clustering and deep feature representation from unlabelled data, which is of interest due to the potential of deep k-means to outperform traditional two-step feature extraction and shallow-clustering strategies. We achieve this by developing a gradient estimator for the non-differentiable k-means objective via the Gumbel-Softmax reparameterisation trick. In contrast to previous attempts at deep clustering, our concrete k-means model can be optimised with respect to the canonical k-means objective and is easily trained end-to-end without resorting to alternating optimisation. We demonstrate the efficacy of our method on standard clustering benchmarks.
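The Gumbel-Softmax trick replaces a hard categorical choice with a soft, temperature-controlled sample. The sketch below shows the standard form of the trick in plain Python, without automatic differentiation. Using negative squared distances to the centroids as logits is an assumption made here for illustration, not a detail taken from the abstract, and the function names are hypothetical.

```python
import math
import random


def gumbel_softmax(logits, temperature, rng):
    """Draw a relaxed one-hot sample: softmax((logits + Gumbel noise) / temperature)."""
    noisy = []
    for logit in logits:
        u = max(rng.random(), 1e-12)   # avoid log(0)
        g = -math.log(-math.log(u))    # standard Gumbel(0, 1) noise
        noisy.append((logit + g) / temperature)
    m = max(noisy)                      # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


def soft_assignment(x, centroids, temperature, rng):
    """Relaxed cluster assignment of x (assumed logits: negative squared distance)."""
    logits = [-sum((a - b) ** 2 for a, b in zip(x, c)) for c in centroids]
    return gumbel_softmax(logits, temperature, rng)
```

Lower temperatures push the sample towards a one-hot vector, while higher temperatures give smoother assignments.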

Machine learning

Fast K-Means Clustering with Anderson Acceleration

181 - Juyong Zhang, Yuxin Yao, Yue Peng 2018

We propose a novel method to accelerate Lloyd's algorithm for K-Means clustering. Unlike previous acceleration approaches that reduce the computational cost per iteration or improve initialization, our approach is focused on reducing the number of iterations required for convergence. This is achieved by treating the assignment step and the update step of Lloyd's algorithm as a fixed-point iteration, and applying Anderson acceleration, a well-established technique for accelerating fixed-point solvers. Classical Anderson acceleration utilizes m previous iterates to find an accelerated iterate, and its performance on K-Means clustering can be sensitive to the choice of m and the distribution of samples. We propose a new strategy to dynamically adjust the value of m, which achieves robust and consistent speedups across different problem instances. Our method complements existing acceleration techniques, and can be combined with them to achieve state-of-the-art performance. We perform extensive experiments to evaluate the performance of the proposed method, where it outperforms other algorithms in 106 out of 120 test cases, and the mean decrease ratio of computational time is more than 33%.
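A minimal sketch of Anderson acceleration with a fixed window of m = 1 on a generic fixed-point map `G` follows. In the K-Means setting, `G` would be one Lloyd iteration mapping the centroids to updated centroids; that wiring, the dynamic adjustment of m, and any safeguards described in the paper are not shown, and the function name is hypothetical.

```python
import math


def anderson_m1(G, x0, tol=1e-10, max_iter=100):
    """Solve x = G(x) using Anderson acceleration with one previous iterate."""
    x_prev = list(x0)
    g_prev = G(x_prev)
    x = list(g_prev)  # the first step is a plain fixed-point step
    for _ in range(max_iter):
        g = G(x)
        f = [gi - xi for gi, xi in zip(g, x)]  # current residual G(x) - x
        if math.sqrt(sum(v * v for v in f)) < tol:
            return x
        f_prev = [gi - xi for gi, xi in zip(g_prev, x_prev)]
        df = [a - b for a, b in zip(f, f_prev)]
        denom = sum(d * d for d in df)
        # theta minimises ||f - theta * (f - f_prev)||; fall back to a plain step if df = 0
        theta = sum(a * d for a, d in zip(f, df)) / denom if denom > 0 else 0.0
        x_next = [gi - theta * (gi - gpi) for gi, gpi in zip(g, g_prev)]
        x_prev, g_prev, x = x, g, x_next
    return x
```

For example, `anderson_m1(lambda x: [0.5 * x[0] + 1.0], [0.0])` solves x = 0.5x + 1, whose fixed point is 2.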

Machine learning, Numerical analysis