Subspace clustering has been extensively studied from the hypothesis-and-test, algebraic, and spectral-clustering-based perspectives. Most methods assume that only a single type/class of subspace is present. Generalizations to multiple types are non-trivial, plagued by challenges such as the choice of types and numbers of models, sampling imbalance, and parameter tuning. In this work, we formulate the multi-type subspace clustering problem as one of learning non-linear subspace filters via deep multi-layer perceptrons (MLPs). The responses to the learnt subspace filters serve as a clustering-friendly feature embedding, i.e., points of the same cluster are embedded closer together by the network. For inference, we apply K-means to the network output to cluster the data. Experiments are carried out on both synthetic and real-world multi-type fitting problems, producing state-of-the-art results.
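The inference step described above (K-means applied to the network output) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the array `embeddings` is synthetic stand-in data for the MLP output, the helper `kmeans` is a hypothetical name, and the deterministic farthest-point initialisation is a simplification chosen only to keep the example reproducible.

```python
import numpy as np

def kmeans(points, k, n_iter=50):
    """Plain Lloyd's K-means; returns (labels, centers)."""
    points = np.asarray(points, dtype=float)
    # Deterministic farthest-point initialisation (a simplification for this sketch).
    centers = [points[0]]
    for _ in range(1, k):
        d = np.min([np.sum((points - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(points[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        # Assignment step: nearest center for every point.
        dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # Update step: mean of each cluster (keep the old center if a cluster is empty).
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Hypothetical stand-in for the network output: two well-separated groups.
rng = np.random.default_rng(0)
embeddings = np.vstack([
    rng.normal(loc=0.0, scale=0.1, size=(20, 3)),
    rng.normal(loc=5.0, scale=0.1, size=(20, 3)),
])
labels, centers = kmeans(embeddings, k=2)
```

If the embedding really does pull points of the same cluster together, as the abstract claims, even this plain K-means should recover the groups; on raw, non-embedded multi-type data it generally would not.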
In recent years, multi-view subspace clustering has achieved impressive performance by exploiting complementary information across multiple views. However, multi-view data can be very complicated and are not easy to cluster in real-world applications. Most existing methods operate on raw data and may not obtain the optimal solution. In this work, we propose a novel multi-view clustering method named smoothed multi-view subspace clustering (SMVSC), which employs a novel technique, i.e., graph filtering, to obtain a smooth representation for each view, in which similar data points have similar feature values. Specifically, it retains the graph geometric features by applying a low-pass filter. Consequently, it produces a ``clustering-friendly'' representation and greatly facilitates the downstream clustering task. Extensive experiments on benchmark datasets validate the superiority of our approach. Analysis shows that graph filtering increases the separability of classes.
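The low-pass graph filtering idea can be illustrated with a small sketch. The abstract does not state the exact filter, so the form used here, $(I - L/2)^k$ with $L$ the symmetric normalized Laplacian, is an assumption (it is one common low-pass choice); the function name `low_pass_filter`, the toy graph `W`, and the features `X` are likewise hypothetical.

```python
import numpy as np

def low_pass_filter(X, W, k=2):
    """Smooth features X (n x d) over a graph with symmetric adjacency W (n x n)."""
    W = np.asarray(W, dtype=float)
    X_smooth = np.asarray(X, dtype=float).copy()
    n = W.shape[0]
    deg = W.sum(axis=1)
    # Guard against isolated nodes (degree 0).
    inv_sqrt = np.where(deg > 0, 1.0 / np.sqrt(np.maximum(deg, 1e-12)), 0.0)
    # Symmetric normalized Laplacian; its eigenvalues lie in [0, 2].
    L = np.eye(n) - inv_sqrt[:, None] * W * inv_sqrt[None, :]
    # Assumed filter: frequency response 1 - lambda/2, which damps high frequencies.
    H = np.eye(n) - 0.5 * L
    for _ in range(k):
        X_smooth = H @ X_smooth
    return X_smooth

# Toy graph: two disconnected triangles (nodes 0-2 and nodes 3-5).
W = np.zeros((6, 6))
W[:3, :3] = 1.0
W[3:, 3:] = 1.0
np.fill_diagonal(W, 0.0)

# One noisy feature per node; each triangle is one "class".
X = np.array([[1.0], [0.0], [2.0], [10.0], [12.0], [11.0]])
X_smooth = low_pass_filter(X, W, k=2)
```

In this toy case the spread of feature values inside each triangle shrinks after filtering while the two triangles stay apart, which is the sense in which connected (similar) points end up with similar feature values.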
Hyperspectral image (HSI) clustering is a challenging task due to the high complexity of HSI data. Subspace clustering has been proven to be powerful for exploiting the intrinsic relationship between data points. Despite their impressive performance in HSI clustering, traditional subspace clustering methods often ignore the inherent structural information among data. In this paper, we revisit subspace clustering with graph convolution and present a novel subspace clustering framework called Graph Convolutional Subspace Clustering (GCSC) for robust HSI clustering. Specifically, the framework recasts the self-expressiveness property of the data into the non-Euclidean domain, which results in a more robust graph embedding dictionary. We show that traditional subspace clustering models are special cases of our framework on Euclidean data. Based on this framework, we further propose two novel subspace clustering models that use the Frobenius norm, namely Efficient GCSC (EGCSC) and Efficient Kernel GCSC (EKGCSC). Both models have a globally optimal closed-form solution, which makes them easier to implement, train, and apply in practice. Extensive experiments on three popular HSI datasets demonstrate that EGCSC and EKGCSC achieve state-of-the-art clustering performance and outperform many existing methods by significant margins.
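To make the closed-form claim concrete, here is a hedged sketch of a Frobenius-norm self-expression model. The objective $\min_Z \|\bar{X} Z - X\|_F^2 + \lambda \|Z\|_F^2$ with $\bar{X} = X A$ is an assumed reading of the abstract, not a formula taken from it; the names `ridge_self_expression`, `lam`, and `A` are hypothetical. Setting the gradient to zero gives $(\bar{X}^\top \bar{X} + \lambda I) Z = \bar{X}^\top X$, which has a unique solution for $\lambda > 0$ because the objective is strongly convex.

```python
import numpy as np

def ridge_self_expression(X, lam=0.1, A=None):
    """Solve min_Z ||X_bar Z - X||_F^2 + lam ||Z||_F^2 with X_bar = X A.

    X holds one sample per column (d x n). A is an optional n x n graph
    matrix (assumed form); A=None means the identity, i.e. the plain
    Euclidean case.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[1]
    X_bar = X if A is None else X @ np.asarray(A, dtype=float)
    # Normal equations: (X_bar^T X_bar + lam I) Z = X_bar^T X
    return np.linalg.solve(X_bar.T @ X_bar + lam * np.eye(n), X_bar.T @ X)

# Toy data: five samples from each of two orthogonal 2-D subspaces of R^4.
rng = np.random.default_rng(0)
basis_1 = np.eye(4)[:, :2]
basis_2 = np.eye(4)[:, 2:]
X = np.hstack([
    basis_1 @ rng.normal(size=(2, 5)),
    basis_2 @ rng.normal(size=(2, 5)),
])
Z = ridge_self_expression(X, lam=0.1)

# Symmetric affinity that a spectral clustering step would consume.
affinity = np.abs(Z) + np.abs(Z).T
```

With `A=None` the dictionary is the data itself, which corresponds to the Euclidean special case the abstract mentions. On this toy data the affinity is block-diagonal, so samples are only expressed by samples from their own subspace.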
Deep multi-view clustering methods have achieved remarkable performance. However, all of them fail to consider the difficulty labels (uncertainty of ground-truth for training samples) over multi-view samples, which may yield a suboptimal clustering network that gets stuck in poor local optima during training. Worse still, the difficulty labels from multi-view samples are always inconsistent, which makes the problem even more challenging. In this paper, we propose a novel Deep Adversarial Inconsistent Cognitive Sampling (DAICS) method for multi-view progressive subspace clustering. A multi-view binary classification (easy or difficult) loss and a feature similarity loss are proposed to jointly learn a binary classifier and a deep consistent feature embedding network, through an adversarial minimax game over the difficulty labels of multi-view consistent samples. We develop a multi-view cognitive sampling strategy to select the input samples from easy to difficult for multi-view clustering network training. However, the distributions of easy and difficult samples are mixed together, so achieving this goal is not trivial. To resolve this, we define a sampling probability with a theoretical guarantee. Based on that, a golden section mechanism is further designed to generate a sample set boundary that progressively selects samples with varied difficulty labels via a gate unit, which is used to jointly learn a multi-view common progressive subspace and a clustering network for more efficient clustering. Experimental results on four real-world datasets demonstrate the superiority of DAICS over state-of-the-art methods.
Deep Subspace Clustering Networks (DSC) provide an efficient solution to the problem of unsupervised subspace clustering by using an undercomplete deep auto-encoder with a fully-connected layer to exploit the self-expressiveness property. This method uses undercomplete representations of the input data, which makes it less robust and more dependent on pre-training. To overcome this, we propose a simple yet efficient alternative method, Overcomplete Deep Subspace Clustering Networks (ODSC), in which we use overcomplete representations for subspace clustering. In our proposed method, we fuse the features from both undercomplete and overcomplete auto-encoder networks before passing them through the self-expressive layer, thus enabling us to extract a more meaningful and robust representation of the input data for clustering. Experimental results on four benchmark datasets show the effectiveness of the proposed method over DSC and other clustering methods in terms of clustering error. Our method is also less dependent than DSC on where pre-training should be stopped to get the best performance, and it is more robust to noise. Code: https://github.com/jeya-maria-jose/Overcomplete-Deep-Subspace-Clustering
Auto-Encoder (AE)-based deep subspace clustering (DSC) methods have achieved impressive performance due to the powerful representation extracted using deep neural networks while prioritizing categorical separability. However, the self-reconstruction loss of an AE ignores rich useful relation information and might lead to a non-discriminative representation, which inevitably degrades the clustering performance. It is also challenging to learn high-level similarity without feeding semantic labels. Another unsolved problem facing DSC is the huge memory cost due to the $n \times n$ similarity matrix, which is incurred by the self-expression layer between the encoder and decoder. To tackle these problems, we use pairwise similarity to weigh the reconstruction loss to capture local structure information, while a similarity is learned by the self-expression layer. Pseudo-graphs and pseudo-labels, which allow benefiting from uncertain knowledge acquired during network training, are further employed to supervise similarity learning. Joint learning and iterative training help obtain an overall optimal solution. Extensive experiments on benchmark datasets demonstrate the superiority of our approach. By combining with the $k$-nearest neighbors algorithm, we further show that our method can address the large-scale and out-of-sample problems.
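One plausible way to read the final sentence is that only a subset of the data is clustered with the self-expression layer, and the remaining or newly arriving points are then labeled by a $k$-nearest-neighbors vote in feature space. The abstract does not spell out the procedure, so the sketch below is an assumption; `knn_assign` and all toy arrays are hypothetical, and the features are stand-ins for learned embeddings.

```python
import numpy as np

def knn_assign(train_feats, train_labels, new_feats, k=3):
    """Label new points by majority vote among their k nearest labeled points.

    train_labels must be non-negative integers (cluster indices).
    """
    train_feats = np.asarray(train_feats, dtype=float)
    train_labels = np.asarray(train_labels)
    new_feats = np.asarray(new_feats, dtype=float)
    # Squared Euclidean distances, shape (n_new, n_train).
    d2 = ((new_feats[:, None, :] - train_feats[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d2, axis=1)[:, :k]
    votes = train_labels[nearest]
    # Majority vote per new point (ties go to the smallest label).
    return np.array([np.bincount(row).argmax() for row in votes])

# Hypothetical features of points that were already clustered.
train_feats = np.array([
    [0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
    [5.0, 5.0], [5.1, 5.0], [5.0, 5.1],
])
train_labels = np.array([0, 0, 0, 1, 1, 1])

# Out-of-sample points that never entered the self-expression layer.
new_feats = np.array([[0.05, 0.05], [4.9, 5.0]])
new_labels = knn_assign(train_feats, train_labels, new_feats, k=3)
```

Under this reading, the $n \times n$ similarity matrix only needs to cover the clustered subset, while the $k$-NN step costs memory proportional to the number of new points times the subset size.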