The Incomplete Rosetta Stone Problem: Identifiability Results for Multi-View Nonlinear ICA


Published by Luigi Gresele

Publication date: 2019

Research field: Mathematical Statistics, Informatics Engineering

Paper language: English

Authors: Luigi Gresele - Paul K. Rubenstein - Arash Mehrjou

Machine Learning


We consider the problem of recovering a common latent source with independent components from multiple views. This applies to settings in which a variable is measured with multiple experimental modalities, and where the goal is to synthesize the disparate measurements into a single unified representation. We consider the case that the observed views are a nonlinear mixing of component-wise corruptions of the sources. When the views are considered separately, this reduces to nonlinear Independent Component Analysis (ICA) for which it is provably impossible to undo the mixing. We present novel identifiability proofs that this is possible when the multiple views are considered jointly, showing that the mixing can theoretically be undone using function approximators such as deep neural networks. In contrast to known identifiability results for nonlinear ICA, we prove that independent latent sources with arbitrary mixing can be recovered as long as multiple, sufficiently different noisy views are available.
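The generative setting described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration and is not taken from the paper: the Laplace-distributed sources, the additive Gaussian noise used as the component-wise corruption, the form of the mixing (a random linear map followed by a strictly increasing elementwise nonlinearity), and all function names.

```python
import math
import random


def sample_sources(dim, rng):
    """Draw one vector of independent sources (Laplace distribution is an assumption)."""
    return [rng.expovariate(1.0) * rng.choice((-1.0, 1.0)) for _ in range(dim)]


def corrupt(sources, noise_scale, rng):
    """Component-wise corruption: each source receives its own independent noise.

    Additive Gaussian noise is an assumption made for this sketch.
    """
    return [s + rng.gauss(0.0, noise_scale) for s in sources]


def make_mixing(dim, rng):
    """Build a hypothetical nonlinear mixing for one view.

    A random linear map followed by a strictly increasing elementwise function.
    """
    weights = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(dim)]

    def mix(z):
        linear = [sum(w * zj for w, zj in zip(row, z)) for row in weights]
        return [h + 0.5 * math.tanh(h) for h in linear]

    return mix


def generate_views(n_samples, dim=2, n_views=2, noise_scale=0.3, seed=0):
    """Return (sources, views); views[i][t] is view i of sample t."""
    rng = random.Random(seed)
    mixings = [make_mixing(dim, rng) for _ in range(n_views)]
    sources = []
    views = [[] for _ in range(n_views)]
    for _ in range(n_samples):
        s = sample_sources(dim, rng)
        sources.append(s)
        for i, mix in enumerate(mixings):
            # Each view mixes its own noisy copy of the shared sources.
            views[i].append(mix(corrupt(s, noise_scale, rng)))
    return sources, views
```

Calling `generate_views(1000)` yields two views that share the same latent sources but differ in both noise and mixing, which is the situation in which the abstract claims joint recovery becomes possible.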

117 - Hermanni Halva, Sylvain Le Corff, Luc Lehericy 2021

We introduce a new general identifiable framework for principled disentanglement referred to as Structured Nonlinear Independent Component Analysis (SNICA). Our contribution is to extend the identifiability theory of deep generative models for a very broad class of structured models. While previous works have shown identifiability for specific classes of time-series models, our theorems extend this to more general temporal structures as well as to models with more complex structures such as spatial dependencies. In particular, we establish the major result that identifiability for this framework holds even in the presence of noise of unknown distribution. The SNICA setting therefore subsumes all the existing nonlinear ICA models for time-series and also allows for new, much richer identifiable models. Finally, as an example of our framework's flexibility, we introduce the first nonlinear ICA model for time-series that combines the following very useful properties: it accounts for both nonstationarity and autocorrelation in a fully unsupervised setting; performs dimensionality reduction; models hidden states; and enables principled estimation and inference by variational maximum-likelihood.

Machine Learning

Discovering Latent Causal Variables via Mechanism Sparsity: A New Principle for Nonlinear ICA

82 - Sebastien Lachapelle, Pau Rodriguez Lopez, Remi Le Priol 2021

It can be argued that finding an interpretable low-dimensional representation of a potentially high-dimensional phenomenon is central to the scientific enterprise. Independent component analysis (ICA) refers to an ensemble of methods which formalize this goal and provide estimation procedures for practical application. This work proposes mechanism sparsity regularization as a new principle to achieve nonlinear ICA when latent factors depend sparsely on observed auxiliary variables and/or past latent factors. We show that the latent variables can be recovered up to a permutation if one regularizes the latent mechanisms to be sparse and if some graphical criterion is satisfied by the data generating process. As a special case, our framework shows how one can leverage unknown-target interventions on the latent factors to disentangle them, thus drawing further connections between ICA and causality. We validate our theoretical results with toy experiments.

Machine Learning

Multi-view Learning as a Nonparametric Nonlinear Inter-Battery Factor Analysis

85 - Andreas Damianou, Neil D. Lawrence, Carl Henrik Ek 2016

Factor analysis aims to determine latent factors, or traits, which summarize a given data set. Inter-battery factor analysis extends this notion to multiple views of the data. In this paper we show how a nonlinear, nonparametric version of these models can be recovered through the Gaussian process latent variable model. This gives us a flexible formalism for multi-view learning where the latent variables can be used both for exploratory purposes and for learning representations that enable efficient inference for ambiguous estimation tasks. Learning is performed in a Bayesian manner through the formulation of a variational compression scheme which gives a rigorous lower bound on the log likelihood. Our Bayesian framework provides strong regularization during training, allowing the structure of the latent space to be determined efficiently and automatically. We demonstrate this by producing the first (to our knowledge) published results of learning from dozens of views, even when data is scarce. We further show experimental results on several different types of multi-view data sets and for different kinds of tasks, including exploratory data analysis, generation, ambiguity modelling through latent priors and classification.

Machine Learning, Probability

A Rosetta Stone for protoplanetary disks: The synergy of multi-wavelength observations

70 - A. Sicilia-Aguilar, A. Banzatti, A. Carmona 2016

The recent progress in instrumentation and telescope development has brought us different ways to observe protoplanetary disks, including interferometers, space missions, adaptive optics, polarimetry, and time- and spectrally-resolved data. While the new facilities have changed the way we can tackle the existing open problems in disk structure and evolution, there is a substantial lack of interconnection between different observing techniques and their user communities. Here, we explore the complementarity of some of the state-of-the-art observing techniques, and how they can be brought together in a collective effort to understand how disks evolve and disperse at the time of planet formation. This paper was born at the Protoplanetary Discussions meeting in Edinburgh, 2016. Its goal is to clarify where multi-wavelength observations of disks converge in unveiling disk structure and evolution, and where they diverge and challenge our current understanding. We discuss caveats that should be considered when linking results from different observations, or when drawing conclusions based on limited datasets (in terms of wavelength or sample). We focus on disk properties that are currently being revolutionized by multi-wavelength observations. Specifically: the inner disk radius, holes and gaps and their link to large-scale disk structures, the disk mass, and the accretion rate. We discuss how the links between them, as well as the apparent contradictions, can help us to disentangle the disk physics and to learn about disk evolution.

Solar and Stellar Astrophysics, Earth and Planetary Astrophysics

View selection in multi-view stacking: Choosing the meta-learner

231 - Wouter van Loon, Marjolein Fokkema, Botond Szabo 2020

Multi-view stacking is a framework for combining information from different views (i.e. different feature sets) describing the same set of objects. In this framework, a base-learner algorithm is trained on each view separately, and their predictions are then combined by a meta-learner algorithm. In a previous study, stacked penalized logistic regression, a special case of multi-view stacking, has been shown to be useful in identifying which views are most important for prediction. In this article we expand this research by considering seven different algorithms to use as the meta-learner, and evaluating their view selection and classification performance in simulations and two applications on real gene-expression data sets. Our results suggest that if both view selection and classification accuracy are important to the research at hand, then the nonnegative lasso, nonnegative adaptive lasso and nonnegative elastic net are suitable meta-learners. Exactly which among these three is to be preferred depends on the research context. The remaining four meta-learners, namely nonnegative ridge regression, nonnegative forward selection, stability selection and the interpolating predictor, offer little advantage that would justify preferring them over the other three.
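The two-stage structure described above (one base learner per view, then a meta-learner over their predictions) can be sketched roughly as follows. This is an illustrative simplification and not the authors' implementation: the nearest-centroid base learner, the squared-error loss in the meta-learner (the abstract mentions penalized logistic regression), the toy data, and all function names are assumptions. For brevity the meta-learner is also fitted on in-sample base predictions, whereas stacking is normally done on cross-validated predictions.

```python
import random


def fit_centroid_learner(rows, labels):
    """Hypothetical base learner for one view: score by relative distance to class centroids."""
    def centroid(cls):
        members = [r for r, y in zip(rows, labels) if y == cls]
        return [sum(col) / len(members) for col in zip(*members)]

    c0, c1 = centroid(0), centroid(1)

    def predict(row):
        d0 = sum((a - b) ** 2 for a, b in zip(row, c0))
        d1 = sum((a - b) ** 2 for a, b in zip(row, c1))
        return d0 / (d0 + d1 + 1e-12)  # close to 1 when the row is near class 1

    return predict


def fit_nonnegative_lasso(Z, y, penalty=0.01, step=0.05, n_iter=2000):
    """Meta-learner sketch: squared error plus an L1 penalty, with weights kept >= 0.

    Solved by projected (proximal) gradient descent; the intercept is unpenalized.
    """
    n, k = len(Z), len(Z[0])
    w, b = [0.0] * k, 0.0
    for _ in range(n_iter):
        residuals = [b + sum(wj * zj for wj, zj in zip(w, z)) - yi
                     for z, yi in zip(Z, y)]
        grad_w = [2.0 / n * sum(r * z[j] for r, z in zip(residuals, Z)) + penalty
                  for j in range(k)]
        grad_b = 2.0 / n * sum(residuals)
        w = [max(0.0, wj - step * g) for wj, g in zip(w, grad_w)]  # project onto w >= 0
        b -= step * grad_b
    return w, b


def multi_view_stacking(views, labels):
    """Train one base learner per view, then combine their predictions."""
    learners = [fit_centroid_learner(v, labels) for v in views]
    Z = [[learn(v[i]) for learn, v in zip(learners, views)]
         for i in range(len(labels))]
    weights, intercept = fit_nonnegative_lasso(Z, labels)
    return learners, weights, intercept


def make_toy_views(n=200, seed=0):
    """Toy data (assumed): a strongly informative, a weakly informative and a pure-noise view."""
    rng = random.Random(seed)
    labels = [i % 2 for i in range(n)]
    strong = [[2.0 * y + rng.gauss(0, 0.5), 2.0 * y + rng.gauss(0, 0.5)] for y in labels]
    weak = [[1.0 * y + rng.gauss(0, 1.0)] for y in labels]
    noise = [[rng.gauss(0, 1.0)] for _ in labels]
    return [strong, weak, noise], labels
```

Running `multi_view_stacking(*make_toy_views())` returns one nonnegative weight per view. A view whose weight is driven to zero is effectively deselected, which is the sense in which a sparse nonnegative meta-learner performs view selection.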

Machine Learning, Methodology