A good clustering can help a data analyst to explore and understand a data set, but what constitutes a good clustering may depend on domain-specific and application-specific criteria. These criteria can be difficult to formalize, even when it is easy for an analyst to know a good clustering when she sees one. We present a new approach to interactive clustering for data exploration, called ciif, based on a particularly simple feedback mechanism, in which an analyst can choose to reject individual clusters and request new ones. The new clusters should be different from previously rejected clusters while still fitting the data well. We formalize this interaction in a novel Bayesian prior elicitation framework. In each iteration, the prior is adapted to account for all the previous feedback, and a new clustering is then produced from the posterior distribution. To achieve the computational efficiency necessary for an interactive setting, we propose an incremental optimization method over data minibatches using Lagrangian relaxation. Experiments demonstrate that ciif can produce accurate and diverse clusterings.
We revisit the kernel random Fourier features (RFF) method of Rahimi and Recht (2007) through the lens of PAC-Bayesian theory. While the primary goal of RFF is to approximate a kernel, we view the kernel's Fourier transform as a prior distribution over trigonometric hypotheses, which naturally suggests learning a posterior on these hypotheses. We derive generalization bounds that are optimized by learning a pseudo-posterior obtained from a closed-form expression. Based on this study, we consider two learning strategies: the first finds a compact landmark-based representation of the data in which each landmark is associated with a distribution-tailored similarity measure, while the second provides a PAC-Bayesian justification for the kernel alignment method of Sinha and Duchi (2016).
We introduce a density-based clustering method called skeleton clustering that can detect clusters of irregular shape in multivariate and even high-dimensional data. To bypass the curse of dimensionality, we propose surrogate density measures that are less dependent on the dimension yet have intuitive geometric interpretations. The clustering framework constructs a concise representation of the given data as an intermediate step and can be viewed as a combination of prototype methods, density-based clustering, and hierarchical clustering. We show through theoretical analysis and empirical studies that skeleton clustering yields reliable clusters in multivariate and high-dimensional scenarios.
We study a recent inferential framework, posterior regularisation, applied to the Bayesian hierarchical mixture clustering (BHMC) model. The framework provides a simple way to impose extra constraints on a Bayesian model, overcoming weaknesses of the original model: it narrows the search space of the model's parameters through a formalism that constrains the features of the solutions found. In this paper, to enhance the separation of clusters, we apply posterior regularisation to impose max-margin constraints on the nodes at every level of the hierarchy. We show how the framework integrates with BHMC and that it achieves the expected improvements over the original Bayesian model.
Bayesian neural networks have shown great promise in many applications where calibrated uncertainty estimates are crucial, and they can often also lead to higher predictive performance. However, it remains challenging to choose a good prior distribution over their weights. While isotropic Gaussian priors are often chosen in practice for their simplicity, they do not reflect our true prior beliefs well and can lead to suboptimal performance. Our new library, BNNpriors, enables state-of-the-art Markov chain Monte Carlo inference on Bayesian neural networks with a wide range of predefined priors, including heavy-tailed ones, hierarchical ones, and mixture priors. Moreover, it follows a modular approach that eases the design and implementation of new custom priors. It has already facilitated foundational discoveries on the nature of the cold posterior effect in Bayesian neural networks and will hopefully catalyze future research as well as practical applications in this area.