This is a survey of the method of graph cuts and its applications to graph clustering of weighted unsigned and signed graphs. I provide a fairly thorough treatment of the method of normalized graph cuts, a deeply original method due to Shi and Malik, including complete proofs. The main thrust of this paper is the method of normalized cuts. I give a detailed account for K = 2 clusters, and also for K > 2 clusters, based on the work of Yu and Shi. I also show how both graph drawing and normalized cut K-clustering can be easily generalized to handle signed graphs, which are weighted graphs in which the weight matrix W may have negative coefficients. Intuitively, negative coefficients indicate distance or dissimilarity. The solution is to replace the degree matrix by the degree matrix computed from the absolute values of the weights, and to define the graph Laplacian in terms of this new degree matrix. As far as I know, the generalization of K-way normalized clustering to signed graphs is new. Finally, I show how the method of ratio cuts, in which a cut is normalized by the size of the cluster rather than its volume, is just a special case of normalized cuts.
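As a rough illustration of that construction, here is a minimal sketch of signed K-way spectral clustering in Python. It is not the paper's implementation: the function name, the use of the generalized eigenproblem, and the k-means rounding step are my own choices, assumed only for the sake of a runnable example.

```python
import numpy as np
from scipy.linalg import eigh
from sklearn.cluster import KMeans

def signed_normalized_cut(W, K):
    """Sketch of K-way spectral clustering for a signed weight matrix W.

    W may contain negative weights (dissimilarities). Following the idea
    described above, the degree matrix is built from absolute values.
    Assumes every vertex has at least one nonzero weight, so the degree
    matrix is invertible.
    """
    d_bar = np.abs(W).sum(axis=1)             # degrees from |w_ij|
    D_bar = np.diag(d_bar)
    L_bar = D_bar - W                          # Laplacian with the new degrees
    # Relaxed problem: generalized eigenproblem L_bar x = lambda D_bar x;
    # keep the K eigenvectors with the smallest eigenvalues.
    eigvals, eigvecs = eigh(L_bar, D_bar)
    U = eigvecs[:, :K]
    # Recover a discrete partition from the relaxed solution with k-means.
    labels = KMeans(n_clusters=K, n_init=10).fit_predict(U)
    return labels
```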
We introduce a family of multi-way Cheeger-type constants $\{h_k^{\sigma},\ k=1,2,\ldots,n\}$ on a signed graph $\Gamma=(G,\sigma)$ such that $h_k^{\sigma}=0$ if and only if $\Gamma$ has $k$ balanced connected components. These constants are switching invariant and bring together in a unified viewpoint a number of important graph-theoretical concepts, including the classical Cheeger constant, the measures of bipartiteness introduced by Desai-Rao, Trevisan, and Bauer-Jost, respectively, on unsigned graphs, and the frustration index (originally called the line index of balance by Harary) on signed graphs. We further unify the (higher-order or improved) Cheeger and dual Cheeger inequalities for unsigned graphs as well as the underlying algorithmic proof techniques by establishing their corresponding versions on signed graphs.
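For orientation, two of the special cases being unified can be stated concretely using standard definitions (the notation below is mine; the constants $h_k^{\sigma}$ themselves are not defined in this abstract). For an unsigned graph $G=(V,E)$ the classical Cheeger constant is
\[
h(G) \;=\; \min_{\emptyset \neq S \subsetneq V} \frac{|E(S, V\setminus S)|}{\min\bigl(\operatorname{vol}(S),\, \operatorname{vol}(V\setminus S)\bigr)},
\]
where $\operatorname{vol}(S)$ is the sum of the degrees of the vertices in $S$, and for a signed graph $\Gamma=(G,\sigma)$ the frustration index is
\[
\iota(\Gamma) \;=\; \min_{\theta\colon V\to\{\pm 1\}} \bigl|\{\, \{u,v\}\in E \;:\; \sigma(\{u,v\})\,\theta(u)\,\theta(v) = -1 \,\}\bigr|,
\]
the minimum number of edges left "frustrated" by any assignment of states to the vertices; it vanishes exactly when $\Gamma$ is balanced.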
Clustering is an important topic in algorithms, and has a number of applications in machine learning, computer vision, statistics, and several other research disciplines. Traditional objectives of graph clustering are to find clusters with low conductance. Not only are these objectives applicable only to undirected graphs, but they are also unable to take the relationships between clusters into account, which can be crucial for many applications. To overcome these downsides, we study directed graphs (digraphs) whose clusters exhibit further structural information amongst each other. Based on the Hermitian matrix representation of digraphs, we present a nearly-linear time algorithm for digraph clustering, and further show that our proposed algorithm can be implemented in sublinear time under reasonable assumptions. The significance of our theoretical work is demonstrated by extensive experimental results on the UN Comtrade Dataset: the output clustering of our algorithm exhibits not only how the clusters (sets of countries) relate to each other with respect to their import and export records, but also how these clusters evolve over time, in accordance with known facts in international trade.
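The abstract does not spell out which Hermitian representation is used. One common choice, assumed here purely for illustration, is the Hermitian adjacency matrix with entries i and -i for the two orientations of an arc:

```python
import numpy as np

def hermitian_adjacency(n, arcs):
    """Standard Hermitian adjacency matrix of a digraph (an assumption here,
    not necessarily the representation used by the authors).

    For an arc u -> v we set A[u, v] = i and A[v, u] = -i, so A is Hermitian:
    its eigenvalues are real and its eigenvectors are complex. Assumes an
    oriented graph with no pair of opposite arcs.
    """
    A = np.zeros((n, n), dtype=complex)
    for u, v in arcs:
        A[u, v] = 1j
        A[v, u] = -1j
    return A

# Toy usage: a directed 3-cycle 0 -> 1 -> 2 -> 0.
A = hermitian_adjacency(3, [(0, 1), (1, 2), (2, 0)])
eigvals, eigvecs = np.linalg.eigh(A)   # real eigenvalues, complex eigenvectors
# A clustering step would then work on the entries of a few extreme
# eigenvectors, whose complex arguments encode the direction of flow
# between clusters.
```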
These are notes on the method of normalized graph cuts and its applications to graph clustering. I provide a fairly thorough treatment of this deeply original method due to Shi and Malik, including complete proofs. I include the necessary background on graphs and graph Laplacians. I then explain in detail how the eigenvectors of the graph Laplacian can be used to draw a graph. This is an attractive application of graph Laplacians. The main thrust of this paper is the method of normalized cuts. I give a detailed account for K = 2 clusters, and also for K > 2 clusters, based on the work of Yu and Shi. Three points that do not appear to have been clearly articulated before are elaborated: 1. The solutions of the main optimization problem should be viewed as tuples in the K-fold Cartesian product of projective space RP^{N-1}. 2. When K > 2, the solutions of the relaxed problem should be viewed as elements of the Grassmannian G(K,N). 3. Two possible Riemannian distances are available to compare the closeness of solutions: (a) The distance on (RP^{N-1})^K. (b) The distance on the Grassmannian. I also clarify what should be the necessary and sufficient conditions for a matrix to represent a partition of the vertices of a graph to be clustered.
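To make the two Riemannian distances in (a) and (b) concrete, here is a small sketch using standard formulas (my notation and function names, not the paper's): the angular distance between lines in RP^{N-1}, and the geodesic distance on the Grassmannian computed from principal angles.

```python
import numpy as np

def projective_distance(x, y):
    """Distance between the lines spanned by x and y in RP^{N-1}:
    the acute angle between them."""
    c = abs(x @ y) / (np.linalg.norm(x) * np.linalg.norm(y))
    return np.arccos(np.clip(c, 0.0, 1.0))

def grassmann_distance(U, V):
    """Geodesic distance between the column spans of U and V (both N x K
    with orthonormal columns) on the Grassmannian G(K, N): the 2-norm of
    the vector of principal angles."""
    s = np.linalg.svd(U.T @ V, compute_uv=False)   # cosines of principal angles
    angles = np.arccos(np.clip(s, -1.0, 1.0))
    return np.linalg.norm(angles)
```

The distance in (a) is then obtained by applying projective_distance componentwise to a K-tuple of lines and combining the K angles, for instance in the Euclidean norm.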
Signed graphs are graphs whose edges are assigned a sign $+1$ or $-1$ (the signature). Signed graphs can be studied by means of graph matrices extended to signed graphs in a natural way. Recently, the spectra of signed graphs have attracted much attention from graph spectra specialists. One motivation is that the spectral theory of signed graphs elegantly generalizes the spectral theories of unsigned graphs. On the other hand, unsigned graphs do not disappear completely, since their role can be taken by the special case of balanced signed graphs. Therefore, spectral problems defined and studied for unsigned graphs can be considered in terms of signed graphs, and sometimes such a generalization reveals nice properties which cannot be appreciated in terms of (unsigned) graphs. Here, we survey some general results on the adjacency spectra of signed graphs, and we consider some spectral problems which are inspired by the spectral theory of (unsigned) graphs.
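A small numerical illustration (my own toy example, not taken from the survey) of why balanced signed graphs can take the role of unsigned graphs: switching is a similarity transformation by a diagonal matrix of signs, so it leaves the adjacency spectrum unchanged, and a balanced signature can always be switched to the all-positive one.

```python
import numpy as np

# Unsigned triangle.
A_unsigned = np.array([[0, 1, 1],
                       [1, 0, 1],
                       [1, 1, 0]], dtype=float)

theta = np.diag([1, -1, 1])            # a switching function on the vertices
A_signed = theta @ A_unsigned @ theta   # a balanced signature on the triangle

spec_unsigned = np.sort(np.linalg.eigvalsh(A_unsigned))
spec_signed = np.sort(np.linalg.eigvalsh(A_signed))
print(np.allclose(spec_unsigned, spec_signed))  # True: identical spectra
```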
The present paper is devoted to clustering geometric graphs. While standard spectral clustering is often not effective for geometric graphs, we present an effective generalization, which we call higher-order spectral clustering. It resembles the classical spectral clustering method in concept, but partitions using the eigenvector associated with a higher-order eigenvalue. We establish the weak consistency of this algorithm for a wide class of geometric graphs which we call the Soft Geometric Block Model. A small adjustment of the algorithm provides strong consistency. We also show that our method is effective in numerical experiments even for graphs of modest size.
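A hedged sketch of the basic step: partition by the sign of an eigenvector attached to a higher-order eigenvalue rather than the second one. The abstract does not specify the rule for choosing the order, so it is left as a parameter here, and the adjacency matrix is assumed as the input representation.

```python
import numpy as np

def higher_order_spectral_split(A, k):
    """Bisect the vertex set of a graph with adjacency matrix A by the sign
    of the eigenvector attached to the k-th largest eigenvalue.

    k = 2 recovers the usual spectral-bisection flavour; larger k gives the
    higher-order variant suggested above (the paper's selection rule for k
    is not given in the abstract).
    """
    eigvals, eigvecs = np.linalg.eigh(A)   # eigenvalues in ascending order
    v = eigvecs[:, -k]                     # eigenvector of k-th largest eigenvalue
    return (v > 0).astype(int)             # 0/1 cluster labels
```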