No Arabic abstract
Archetypal analysis is an unsupervised learning method for exploratory data analysis. One major challenge that limits the applicability of archetypal analysis in practice is the inherent computational complexity of the existing algorithms. In this paper, we provide a novel approximation approach to partially address this issue. Utilizing probabilistic ideas from high-dimensional geometry, we introduce two preprocessing techniques to reduce the dimension and representation cardinality of the data, respectively. We prove that, provided the data is approximately embedded in a low-dimensional linear subspace and the convex hull of the corresponding representations is well approximated by a polytope with a few vertices, our method can effectively reduce the scaling of archetypal analysis. Moreover, the solution of the reduced problem is near-optimal in terms of prediction errors. Our approach can be combined with other acceleration techniques to further mitigate the intrinsic complexity of archetypal analysis. We demonstrate the usefulness of our results by applying our method to summarize several moderately large-scale datasets.
In this work we introduce a reduced-rank algorithm for Gaussian process regression. Our numerical scheme converts a Gaussian process on a user-specified interval to its Karhunen-Lo`eve expansion, the $L^2$-optimal reduced-rank representation. Numerical evaluation of the Karhunen-Lo`eve expansion is performed once during precomputation and involves computing a numerical eigendecomposition of an integral operator whose kernel is the covariance function of the Gaussian process. The Karhunen-Lo`eve expansion is independent of observed data and depends only on the covariance kernel and the size of the interval on which the Gaussian process is defined. The scheme of this paper does not require translation invariance of the covariance kernel. We also introduce a class of fast algorithms for Bayesian fitting of hyperparameters, and demonstrate the performance of our algorithms with numerical experiments in one and two dimensions. Extensions to higher dimensions are mathematically straightforward but suffer from the standard curses of high dimensions.
This article is the rejoinder for the paper Probabilistic Integration: A Role in Statistical Computation? to appear in Statistical Science with discussion. We would first like to thank the reviewers and many of our colleagues who helped shape this paper, the editor for selecting our paper for discussion, and of course all of the discussants for their thoughtful, insightful and constructive comments. In this rejoinder, we respond to some of the points raised by the discussants and comment further on the fundamental questions underlying the paper: (i) Should Bayesian ideas be used in numerical analysis?, and (ii) If so, what role should such approaches have in statistical computation?
Probabilistic Latent Tensor Factorization (PLTF) is a recently proposed probabilistic framework for modelling multi-way data. Not only the common tensor factorization models but also any arbitrary tensor factorization structure can be realized by the PLTF framework. This paper presents full Bayesian inference via variational Bayes that facilitates more powerful modelling and allows more sophisticated inference on the PLTF framework. We illustrate our approach on model order selection and link prediction.
In this article, we consider computing expectations w.r.t. probability measures which are subject to discretization error. Examples include partially observed diffusion processes or inverse problems, where one may have to discretize time and/or space, in order to practically work with the probability of interest. Given access only to these discretizations, we consider the construction of unbiased Monte Carlo estimators of expectations w.r.t. such target probability distributions. It is shown how to obtain such estimators using a novel adaptation of randomization schemes and Markov simulation methods. Under appropriate assumptions, these estimators possess finite variance and finite expected cost. There are two important consequences of this approach: (i) unbiased inference is achieved at the canonical complexity rate, and (ii) the resulting estimators can be generated independently, thereby allowing strong scaling to arbitrarily many parallel processors. Several algorithms are presented, and applied to some examples of Bayesian inference problems, with both simulated and real observed data.
This position paper summarizes a recently developed research program focused on inference in the context of data centric science and engineering applications, and forecasts its trajectory forward over the next decade. Often one endeavours in this context to learn complex systems in order to make more informed predictions and high stakes decisions under uncertainty. Some key challenges which must be met in this context are robustness, generalizability, and interpretability. The Bayesian framework addresses these three challenges, while bringing with it a fourth, undesirable feature: it is typically far more expensive than its deterministic counterparts. In the 21st century, and increasingly over the past decade, a growing number of methods have emerged which allow one to leverage cheap low-fidelity models in order to precondition algorithms for performing inference with more expensive models and make Bayesian inference tractable in the context of high-dimensional and expensive models. Notable examples are multilevel Monte Carlo (MLMC), multi-index Monte Carlo (MIMC), and their randomized counterparts (rMLMC), which are able to provably achieve a dimension-independent (including $infty-$dimension) canonical complexity rate with respect to mean squared error (MSE) of $1/$MSE. Some parallelizability is typically lost in an inference context, but recently this has been largely recovered via novel double randomization approaches. Such an approach delivers i.i.d. samples of quantities of interest which are unbiased with respect to the infinite resolution target distribution. Over the coming decade, this family of algorithms has the potential to transform data centric science and engineering, as well as classical machine learning applications such as deep learning, by scaling up and scaling out fully Bayesian inference.