ﻻ يوجد ملخص باللغة العربية
Archetypal analysis is an unsupervised learning method for exploratory data analysis. One major challenge that limits the applicability of archetypal analysis in practice is the inherent computational complexity of the existing algorithms. In this paper, we provide a novel approximation approach to partially address this issue. Utilizing probabilistic ideas from high-dimensional geometry, we introduce two preprocessing techniques to reduce the dimension and representation cardinality of the data, respectively. We prove that, provided the data is approximately embedded in a low-dimensional linear subspace and the convex hull of the corresponding representations is well approximated by a polytope with a few vertices, our method can effectively reduce the scaling of archetypal analysis. Moreover, the solution of the reduced problem is near-optimal in terms of prediction errors. Our approach can be combined with other acceleration techniques to further mitigate the intrinsic complexity of archetypal analysis. We demonstrate the usefulness of our results by applying our method to summarize several moderately large-scale datasets.
In this work we introduce a reduced-rank algorithm for Gaussian process regression. Our numerical scheme converts a Gaussian process on a user-specified interval to its Karhunen-Lo`eve expansion, the $L^2$-optimal reduced-rank representation. Numeric
This article is the rejoinder for the paper Probabilistic Integration: A Role in Statistical Computation? to appear in Statistical Science with discussion. We would first like to thank the reviewers and many of our colleagues who helped shape this pa
Probabilistic Latent Tensor Factorization (PLTF) is a recently proposed probabilistic framework for modelling multi-way data. Not only the common tensor factorization models but also any arbitrary tensor factorization structure can be realized by the
In this article, we consider computing expectations w.r.t. probability measures which are subject to discretization error. Examples include partially observed diffusion processes or inverse problems, where one may have to discretize time and/or space
This position paper summarizes a recently developed research program focused on inference in the context of data centric science and engineering applications, and forecasts its trajectory forward over the next decade. Often one endeavours in this con