CFSFDP (clustering by fast search and find of density peaks) is a recently developed density-based clustering algorithm. Compared to DBSCAN, it needs fewer parameters and is computationally cheap because it is non-iterative. Alex et al. have demonstrated its power in many applications. However, CFSFDP performs poorly when a single cluster contains more than one density peak, a situation we refer to as "no density peaks". In this paper, inspired by the hierarchical clustering algorithm CHAMELEON, we propose an extension of CFSFDP, E_CFSFDP, that adapts to more applications. In particular, we first use the original CFSFDP to generate initial clusters, then merge the sub-clusters in a second phase. We have applied the algorithm to several data sets in which there are no density peaks. Experimental results show that our approach outperforms the original one because it relaxes the strict requirements on data sets.
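As a rough illustration of the first phase only, the following is a minimal sketch of the original CFSFDP procedure as it is commonly described: a cutoff-kernel local density, the distance to the nearest point of higher density, and assignment of each remaining point to the cluster of that neighbor. The function name `density_peaks`, the parameters `d_c` and `n_clusters`, and the use of the product of density and distance to pick centers are assumptions made for this example, not details taken from the abstract. The second (merge) phase of E_CFSFDP is not sketched, since the abstract does not state the merge criterion.

```python
import numpy as np

def density_peaks(points, d_c, n_clusters):
    """Sketch of original CFSFDP: density rho, distance delta, centers, assignment."""
    n = len(points)
    # Pairwise Euclidean distances (O(n^2) memory; fine for small examples).
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))

    # rho_i: number of other points closer than the cutoff distance d_c.
    rho = (dist < d_c).sum(axis=1) - 1

    # Visit points from highest to lowest density; ties are broken by index.
    order = np.argsort(-rho, kind="stable")
    delta = np.zeros(n)
    nearest_higher = np.full(n, -1)
    for rank, i in enumerate(order):
        if rank == 0:
            # The densest point has no higher-density neighbor.
            delta[i] = dist[i].max()
            continue
        higher = order[:rank]
        j = higher[np.argmin(dist[i, higher])]
        delta[i] = dist[i, j]
        nearest_higher[i] = j

    # Assumed selection rule: take the n_clusters largest values of rho * delta.
    gamma = rho * delta
    gamma[order[0]] = np.inf  # the global density maximum is always a center
    centers = np.argsort(-gamma, kind="stable")[:n_clusters]

    # Each non-center inherits the label of its nearest higher-density point.
    labels = np.full(n, -1)
    labels[centers] = np.arange(n_clusters)
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[nearest_higher[i]]
    return labels, rho, delta
```

Because every point follows its nearest higher-density neighbor, a cluster with several density peaks is split into several sub-clusters by this step, which is the case the merge phase is meant to address.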
In this paper we revisit the kernel density estimation problem: given a kernel $K(x, y)$ and a dataset of $n$ points in high dimensional Euclidean space, prepare a data structure that can quickly output, given a query $q$, a $(1+\epsilon)$-approximati
As one type of efficient unsupervised learning method, clustering algorithms have been widely used in data mining and knowledge discovery with noticeable advantages. However, clustering algorithms based on density peaks have limited clustering effect
Measuring graph clustering quality remains an open problem. To address it, we introduce quality measures based on comparisons of intra- and inter-cluster densities, an accompanying statistical test of the significance of their differences and a step-
Time Projection Chambers (TPCs) working in combination with Gas Electron Multipliers (GEMs) produce a very sensitive detector capable of observing low energy events. This is achieved by capturing photons generated during the GEM electron multiplicati
This paper revisits the problem of computing empirical cumulative distribution functions (ECDF) efficiently on large, multivariate datasets. Computing an ECDF at one evaluation point requires $\mathcal{O}(N)$ operations on a dataset composed of $N$ da