ترغب بنشر مسار تعليمي؟ اضغط هنا

Algorithm-Agnostic Explainability for Unsupervised Clustering

55   0   0.0 ( 0 )
 نشر من قبل Charles Ellis
 تاريخ النشر 2021
  مجال البحث الهندسة المعلوماتية
والبحث باللغة English




اسأل ChatGPT حول البحث

Supervised machine learning explainability has developed rapidly in recent years. However, clustering explainability has lagged behind. Here, we demonstrate the first adaptation of model-agnostic explainability methods to explain unsupervised clustering. We present two novel algorithm-agnostic explainability methods - global permutation percent change (G2PC) and local perturbation percent change (L2PC) - that identify feature importance globally to a clustering algorithm and locally to the clustering of individual samples. The methods are (1) easy to implement and (2) broadly applicable across clustering algorithms, which could make them highly impactful. We demonstrate the utility of the methods for explaining five popular clustering methods on low-dimensional synthetic datasets and on high-dimensional functional network connectivity data extracted from a resting-state functional magnetic resonance imaging dataset of 151 individuals with schizophrenia and 160 controls. Our results are consistent with existing literature while also shedding new light on how changes in brain connectivity may lead to schizophrenia symptoms. We further compare the explanations from our methods to an interpretable classifier and find them to be highly similar. Our proposed methods robustly explain multiple clustering algorithms and could facilitate new insights into many applications. We hope this study will greatly accelerate the development of the field of clustering explainability.



قيم البحث

اقرأ أيضاً

What makes two images similar? We propose new approaches to generate model-agnostic explanations for image similarity, search, and retrieval. In particular, we extend Class Activation Maps (CAMs), Additive Shapley Explanations (SHAP), and Locally Int erpretable Model-Agnostic Explanations (LIME) to the domain of image retrieval and search. These approaches enable black and grey-box model introspection and can help diagnose errors and understand the rationale behind a models similarity judgments. Furthermore, we extend these approaches to extract a full pairwise correspondence between the query and retrieved image pixels, an approach we call joint interpretations. Formally, we show joint search interpretations arise from projecting Harsanyi dividends, and that this approach generalizes Shapley Values and The Shapley-Taylor indices. We introduce a fast kernel-based method for estimating Shapley-Taylor indices and empirically show that these game-theoretic measures yield more consistent explanations for image similarity architectures.
We develop a new family of convex relaxations for $k$-means clustering based on sum-of-squares norms, a relaxation of the injective tensor norm that is efficiently computable using the Sum-of-Squares algorithm. We give an algorithm based on this rela xation that recovers a faithful approximation to the true means in the given data whenever the low-degree moments of the points in each cluster have bounded sum-of-squares norms. We then prove a sharp upper bound on the sum-of-squares norms for moment tensors of any distribution that satisfies the emph{Poincare inequality}. The Poincare inequality is a central inequality in probability theory, and a large class of distributions satisfy it including Gaussians, product distributions, strongly log-concave distributions, and any sum or uniformly continuous transformation of such distributions. As an immediate corollary, for any $gamma > 0$, we obtain an efficient algorithm for learning the means of a mixture of $k$ arbitrary Poincare distributions in $mathbb{R}^d$ in time $d^{O(1/gamma)}$ so long as the means have separation $Omega(k^{gamma})$. This in particular yields an algorithm for learning Gaussian mixtures with separation $Omega(k^{gamma})$, thus partially resolving an open problem of Regev and Vijayaraghavan citet{regev2017learning}. Our algorithm works even in the outlier-robust setting where an $epsilon$ fraction of arbitrary outliers are added to the data, as long as the fraction of outliers is smaller than the smallest cluster. We, therefore, obtain results in the strong agnostic setting where, in addition to not knowing the distribution family, the data itself may be arbitrarily corrupted.
In this paper, we propose an unsupervised collaborative representation deep network (UCRDNet) which consists of novel collaborative representation RBM (crRBM) and collaborative representation GRBM (crGRBM). The UCRDNet is a novel deep collaborative f eature extractor for exploring more sophisticated probabilistic models of real-valued and binary data. Unlike traditional representation methods, one similarity relation between the input instances and another similarity relation between the features of the input instances are collaboratively fused together in the representation process of the crGRBM and crRBM models. Here, we use the Locality Sensitive Hashing (LSH) method to divide the input instance matrix into many mini blocks which contain similar instance and local features. Then, we expect the hidden layer feature units of each block gather to block center as much as possible in the training processes of the crRBM and crGRBM. Hence, the correlations between the instances and features as collaborative relations are fused in the hidden layer features. In the experiments, we use K-means and Spectral Clustering (SC) algorithms based on four contrast deep networks to verify the deep collaborative representation capability of the UCRDNet architecture. One architecture of the UCRDNet is composed with a crGRBM and two crRBMs for modeling real-valued data and another architecture of it is composed with three crRBMs for modeling binary data. The experimental results show that the proposed UCRDNet has more outstanding performance than the Autoencoder and DeepFS deep networks (without collaborative representation strategy) for unsupervised clustering on the MSRA-MM2.0 and UCI datasets. Furthermore, the proposed UCRDNet shows more excellent collaborative representation capabilities than the CDL deep collaborative networks for unsupervised clustering.
This paper presents the intrinsic limit determination algorithm (ILD Algorithm), a novel technique to determine the best possible performance, measured in terms of the AUC (area under the ROC curve) and accuracy, that can be obtained from a specific dataset in a binary classification problem with categorical features {sl regardless} of the model used. This limit, namely the Bayes error, is completely independent of any model used and describes an intrinsic property of the dataset. The ILD algorithm thus provides important information regarding the prediction limits of any binary classification algorithm when applied to the considered dataset. In this paper the algorithm is described in detail, its entire mathematical framework is presented and the pseudocode is given to facilitate its implementation. Finally, an example with a real dataset is given.
In this paper, we study Contextual Unsupervised Sequential Selection (USS), a new variant of the stochastic contextual bandits problem where the loss of an arm cannot be inferred from the observed feedback. In our setup, arms are associated with fixe d costs and are ordered, forming a cascade. In each round, a context is presented, and the learner selects the arms sequentially till some depth. The total cost incurred by stopping at an arm is the sum of fixed costs of arms selected and the stochastic loss associated with the arm. The learners goal is to learn a decision rule that maps contexts to arms with the goal of minimizing the total expected loss. The problem is challenging as we are faced with an unsupervised setting as the total loss cannot be estimated. Clearly, learning is feasible only if the optimal arm can be inferred (explicitly or implicitly) from the problem structure. We observe that learning is still possible when the problem instance satisfies the so-called Contextual Weak Dominance (CWD) property. Under CWD, we propose an algorithm for the contextual USS problem and demonstrate that it has sub-linear regret. Experiments on synthetic and real datasets validate our algorithm.

الأسئلة المقترحة

التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا