ترغب بنشر مسار تعليمي؟ اضغط هنا

Rectangular Flows for Manifold Learning

221   0   0.0 ( 0 )
 نشر من قبل Anthony Caterini
 تاريخ النشر 2021
والبحث باللغة English




اسأل ChatGPT حول البحث

Normalizing flows are invertible neural networks with tractable change-of-volume terms, which allows optimization of their parameters to be efficiently performed via maximum likelihood. However, data of interest is typically assumed to live in some (often unknown) low-dimensional manifold embedded in high-dimensional ambient space. The result is a modelling mismatch since -- by construction -- the invertibility requirement implies high-dimensional support of the learned distribution. Injective flows, mapping from low- to high-dimensional space, aim to fix this discrepancy by learning distributions on manifolds, but the resulting volume-change term becomes more challenging to evaluate. Current approaches either avoid computing this term entirely using various heuristics, or assume the manifold is known beforehand and therefore are not widely applicable. Instead, we propose two methods to tractably calculate the gradient of this term with respect to the parameters of the model, relying on careful use of automatic differentiation and techniques from numerical linear algebra. Both approaches perform end-to-end nonlinear manifold learning and density estimation for data projected onto this manifold. We study the trade-offs between our proposed methods, empirically verify that we outperform approaches ignoring the volume-change term by more accurately learning manifolds and the corresponding distributions on them, and show promising results on out-of-distribution detection.



قيم البحث

اقرأ أيضاً

We introduce manifold-learning flows (M-flows), a new class of generative models that simultaneously learn the data manifold as well as a tractable probability density on that manifold. Combining aspects of normalizing flows, GANs, autoencoders, and energy-based models, they have the potential to represent datasets with a manifold structure more faithfully and provide handles on dimensionality reduction, denoising, and out-of-distribution detection. We argue why such models should not be trained by maximum likelihood alone and present a new training algorithm that separates manifold and density updates. In a range of experiments we demonstrate how M-flows learn the data manifold and allow for better inference than standard flows in the ambient data space.
Tractably modelling distributions over manifolds has long been an important goal in the natural sciences. Recent work has focused on developing general machine learning models to learn such distributions. However, for many applications these distribu tions must respect manifold symmetries -- a trait which most previous models disregard. In this paper, we lay the theoretical foundations for learning symmetry-invariant distributions on arbitrary manifolds via equivariant manifold flows. We demonstrate the utility of our approach by using it to learn gauge invariant densities over $SU(n)$ in the context of quantum field theory.
Characterizing the appearance of real-world surfaces is a fundamental problem in multidimensional reflectometry, computer vision and computer graphics. For many applications, appearance is sufficiently well characterized by the bidirectional reflecta nce distribution function (BRDF). We treat BRDF measurements as samples of points from high-dimensional non-linear non-convex manifolds. BRDF manifolds form an infinite-dimensional space, but typically the available measurements are very scarce for complicated problems such as BRDF estimation. Therefore, an efficient learning strategy is crucial when performing the measurements. In this paper, we build the foundation of a mathematical framework that allows to develop and apply new techniques within statistical design of experiments and generalized proactive learning, in order to establish more efficient sampling and measurement strategies for BRDF data manifolds.
A fundamental task in data exploration is to extract simplified low dimensional representations that capture intrinsic geometry in data, especially for faithfully visualizing data in two or three dimensions. Common approaches to this task use kernel methods for manifold learning. However, these methods typically only provide an embedding of fixed input data and cannot extend to new data points. Autoencoders have also recently become popular for representation learning. But while they naturally compute feature extractors that are both extendable to new data and invertible (i.e., reconstructing original features from latent representation), they have limited capabilities to follow global intrinsic geometry compared to kernel-based manifold learning. We present a new method for integrating both approaches by incorporating a geometric regularization term in the bottleneck of the autoencoder. Our regularization, based on the diffusion potential distances from the recently-proposed PHATE visualization method, encourages the learned latent representation to follow intrinsic data geometry, similar to manifold learning algorithms, while still enabling faithful extension to new data and reconstruction of data in the original feature space from latent coordinates. We compare our approach with leading kernel methods and autoencoder models for manifold learning to provide qualitative and quantitative evidence of our advantages in preserving intrinsic structure, out of sample extension, and reconstruction. Our method is easily implemented for big-data applications, whereas other methods are limited in this regard.
128 - Xiuyuan Cheng , Yao Xie 2021
We present a study of kernel MMD two-sample test statistics in the manifold setting, assuming the high-dimensional observations are close to a low-dimensional manifold. We characterize the property of the test (level and power) in relation to the ker nel bandwidth, the number of samples, and the intrinsic dimensionality of the manifold. Specifically, we show that when data densities are supported on a $d$-dimensional sub-manifold $mathcal{M}$ embedded in an $m$-dimensional space, the kernel MMD two-sample test for data sampled from a pair of distributions $(p, q)$ that are Holder with order $beta$ is consistent and powerful when the number of samples $n$ is greater than $delta_2(p,q)^{-2-d/beta}$ up to certain constant, where $delta_2$ is the squared $ell_2$-divergence between two distributions on manifold. Moreover, to achieve testing consistency under this scaling of $n$, our theory suggests that the kernel bandwidth $gamma$ scales with $n^{-1/(d+2beta)}$. These results indicate that the kernel MMD two-sample test does not have a curse-of-dimensionality when the data lie on the low-dimensional manifold. We demonstrate the validity of our theory and the property of the MMD test for manifold data using several numerical experiments.

الأسئلة المقترحة

التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا