Estimation of Bivariate Structural Causal Models by Variational Gaussian Process Regression Under Likelihoods Parametrised by Normalising Flows

87 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Nico Reick

تاريخ النشر 2021

مجال البحث الاحصاء الرياضي الهندسة المعلوماتية

والبحث باللغة English

تأليف Nico Reick - Felix Wiewel - Alexander Bartler

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

One major drawback of state-of-the-art artificial intelligence is its lack of explainability. One approach to solve the problem is taking causality into account. Causal mechanisms can be described by structural causal models. In this work, we propose a method for estimating bivariate structural causal models using a combination of normalising flows applied to density estimation and variational Gaussian process regression for post-nonlinear models. It facilitates causal discovery, i.e. distinguishing cause and effect, by either the independence of cause and residual or a likelihood ratio test. Our method which estimates post-nonlinear models can better explain a variety of real-world cause-effect pairs than a simple additive noise model. Though it remains difficult to exploit this benefit regarding all pairs from the Tubingen benchmark database, we demonstrate that combining the additive noise model approach with our method significantly enhances causal discovery.

قيم البحث

114 - Paul K. Rubenstein , Sebastian Weichwald , Stephan Bongers 2017

Complex systems can be modelled at various levels of detail. Ideally, causal models of the same system should be consistent with one another in the sense that they agree in their predictions of the effects of interventions. We formalise this notion o f consistency in the case of Structural Equation Models (SEMs) by introducing exact transformations between SEMs. This provides a general language to consider, for instance, the different levels of description in the following three scenarios: (a) models with large numbers of variables versus models in which the `irrelevant or unobservable variables have been marginalised out; (b) micro-level models versus macro-level models in which the macro-variables are aggregate features of the micro-variables; (c) dynamical time series models versus models of their stationary behaviour. Our analysis stresses the importance of well specified interventions in the causal modelling process and sheds light on the interpretation of cyclic SEMs.

التعلم الالي الذكاء الاصطناعي التعلم الآلي

Domain adaptation under structural causal models

123 - Yuansi Chen , Peter Buhlmann 2020

Domain adaptation (DA) arises as an important problem in statistical machine learning when the source data used to train a model is different from the target data used to test the model. Recent advances in DA have mainly been application-driven and h ave largely relied on the idea of a common subspace for source and target data. To understand the empirical successes and failures of DA methods, we propose a theoretical framework via structural causal models that enables analysis and comparison of the prediction performance of DA methods. This framework also allows us to itemize the assumptions needed for the DA methods to have a low target error. Additionally, with insights from our theory, we propose a new DA method called CIRM that outperforms existing DA methods when both the covariates and label distributions are perturbed in the target data. We complement the theoretical analysis with extensive simulations to show the necessity of the devised assumptions. Reproducible synthetic and real data experiments are also provided to illustrate the strengths and weaknesses of DA methods when parts of the assumptions of our theory are violated.

التعلم الالي التعلم الآلي نظرية الإحصاء

Conditional Density Estimation with Bayesian Normalising Flows

267 - Brian L Trippe , Richard E Turner 2018

Modeling complex conditional distributions is critical in a variety of settings. Despite a long tradition of research into conditional density estimation, current methods employ either simple parametric forms or are difficult to learn in practice. Th is paper employs normalising flows as a flexible likelihood model and presents an efficient method for fitting them to complex densities. These estimators must trade-off between modeling distributional complexity, functional complexity and heteroscedasticity without overfitting. We recognize these trade-offs as modeling decisions and develop a Bayesian framework for placing priors over these conditional density estimators using variational Bayesian neural networks. We evaluate this method on several small benchmark regression datasets, on some of which it obtains state of the art performance. Finally, we apply the method to two spatial density modeling tasks with over 1 million datapoints using the New York City yellow taxi dataset and the Chicago crime dataset.

التعلم الالي

Model Selection for Gaussian Process Regression by Approximation Set Coding

85 - Benjamin Fischer , Nico Gorbach , Stefan Bauer 2016

Gaussian processes are powerful, yet analytically tractable models for supervised learning. A Gaussian process is characterized by a mean function and a covariance function (kernel), which are determined by a model selection criterion. The functions to be compared do not just differ in their parametrization but in their fundamental structure. It is often not clear which function structure to choose, for instance to decide between a squared exponential and a rational quadratic kernel. Based on the principle of approximation set coding, we develop a framework for model selection to rank kernels for Gaussian process regression. In our experiments approximation set coding shows promise to become a model selection criterion competitive with maximum evidence (also called marginal likelihood) and leave-one-out cross-validation.

التعلم الالي

Latent Gaussian Process Regression

101 - Erik Bodin , Neill D. F. Campbell , Carl Henrik Ek 2017

We introduce Latent Gaussian Process Regression which is a latent variable extension allowing modelling of non-stationary multi-modal processes using GPs. The approach is built on extending the input space of a regression problem with a latent variab le that is used to modulate the covariance function over the training data. We show how our approach can be used to model multi-modal and non-stationary processes. We exemplify the approach on a set of synthetic data and provide results on real data from motion capture and geostatistics.

التعلم الالي التعلم الآلي