ﻻ يوجد ملخص باللغة العربية
This work draws inspiration from three important sources of research on dissimilarity-based clustering and intertwines those three threads into a consistent principled functorial theory of clustering. Those three are the overlapping clustering of Jardine and Sibson, the functorial approach of Carlsson and M{e}moli to partition-based clustering, and the Isbell/Dress schools study of injective envelopes. Carlsson and M{e}moli introduce the idea of viewing clustering methods as functors from a category of metric spaces to a category of clusters, with functoriality subsuming many desirable properties. Our first series of results extends their theory of functorial clustering schemes to methods that allow overlapping clusters in the spirit of Jardine and Sibson. This obviates some of the unpleasant effects of chaining that occur, for example with single-linkage clustering. We prove an equivalence between these general overlapping clustering functors and projections of weight spaces to what we term clustering domains, by focusing on the order structure determined by the morphisms. As a specific application of this machinery, we are able to prove that there are no functorial projections to cut metrics, or even to tree metrics. Finally, although we focus less on the construction of clustering methods (clustering domains) derived from injective envelopes, we lay out some preliminary results, that hopefully will give a feel for how the third leg of the stool comes into play.
Hierarchical clustering is a widely used approach for clustering datasets at multiple levels of granularity. Despite its popularity, existing algorithms such as hierarchical agglomerative clustering (HAC) are limited to the offline setting, and thus
Hierarchical clustering is an important technique to organize big data for exploratory data analysis. However, existing one-size-fits-all hierarchical clustering methods often fail to meet the diverse needs of different users. To address this challen
We investigate active learning by pairwise similarity over the leaves of trees originating from hierarchical clustering procedures. In the realizable setting, we provide a full characterization of the number of queries needed to achieve perfect recon
We present a hierarchical maximum-margin clustering method for unsupervised data analysis. Our method extends beyond flat maximum-margin clustering, and performs clustering recursively in a top-down manner. We propose an effective greedy splitting cr
Bottom-up algorithms such as the classic hierarchical agglomerative clustering, are highly effective for hierarchical as well as flat clustering. However, the large number of rounds and their sequential nature limit the scalability of agglomerative c