A hierarchical Bayesian model to find brain-behaviour associations in incomplete data sets

81 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Fabio Ferreira

تاريخ النشر 2021

مجال البحث الاحصاء الرياضي الهندسة المعلوماتية

والبحث باللغة English

تأليف Fabio S. Ferreira - Agoston Mihalik - Rick A. Adams

التعلم الالي التعلم الآلي

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

Canonical Correlation Analysis (CCA) and its regularis

قيم البحث

82 - Dominik Linzner , Michael Schmidt , Heinz Koeppl 2019

Continuous-time Bayesian Networks (CTBNs) represent a compact yet powerful framework for understanding multivariate time-series data. Given complete data, parameters and structure can be estimated efficiently in closed-form. However, if data is incom plete, the latent states of the CTBN have to be estimated by laboriously simulating the intractable dynamics of the assumed CTBN. This is a problem, especially for structure learning tasks, where this has to be done for each element of a super-exponentially growing set of possible structures. In order to circumvent this notorious bottleneck, we develop a novel gradient-based approach to structure learning. Instead of sampling and scoring all possible structures individually, we assume the generator of the CTBN to be composed as a mixture of generators stemming from different structures. In this framework, structure learning can be performed via a gradient-based optimization of mixture weights. We combine this approach with a new variational method that allows for a closed-form calculation of this mixture marginal likelihood. We show the scalability of our method by learning structures of previously inaccessible sizes from synthetic and real-world data.

التعلم الالي التعلم الآلي

$(1 + varepsilon)$-class Classification: an Anomaly Detection Method for Highly Imbalanced or Incomplete Data Sets

75 - Maxim Borisyak , Artem Ryzhikov , Andrey Ustyuzhanin 2019

Anomaly detection is not an easy problem since distribution of anomalous samples is unknown a priori. We explore a novel method that gives a trade-off possibility between one-class and two-class approaches, and leads to a better performance on anomal y detection problems with small or non-representative anomalous samples. The method is evaluated using several data sets and compared to a set of conventional one-class and two-class approaches.

التعلم الالي التعلم الآلي

Trace your sources in large-scale data: one ring to find them all

176 - Alexander Bottcher , Wieland Brendel , Bernhard Englitz 2018

An important preprocessing step in most data analysis pipelines aims to extract a small set of sources that explain most of the data. Currently used algorithms for blind source separation (BSS), however, often fail to extract the desired sources and need extensive cross-validation. In contrast, their rarely used probabilistic counterparts can get away with little cross-validation and are more accurate and reliable but no simple and scalable implementations are available. Here we present a novel probabilistic BSS framework (DECOMPOSE) that can be flexibly adjusted to the data, is extensible and easy to use, adapts to individual sources and handles large-scale data through algorithmic efficiency. DECOMPOSE encompasses and generalises many traditional BSS algorithms such as PCA, ICA and NMF and we demonstrate substantial improvements in accuracy and robustness on artificial and real data.

التعلم الالي التعلم الآلي المنهجية

A Bayesian Approach to Modelling Longitudinal Data in Electronic Health Records

101 - Alexis Bellot , Mihaela van der Schaar 2019

Analyzing electronic health records (EHR) poses significant challenges because often few samples are available describing a patients health and, when available, their information content is highly diverse. The problem we consider is how to integrate sparsely sampled longitudinal data, missing measurements informative of the underlying health status and fixed demographic information to produce estimated survival distributions updated through a patients follow up. We propose a nonparametric probabilistic model that generates survival trajectories from an ensemble of Bayesian trees that learns variable interactions over time without specifying beforehand the longitudinal process. We show performance improvements on Primary Biliary Cirrhosis patient data.

التعلم الالي التعلم الآلي تطبيقات الإحصاء

MIWAE: Deep Generative Modelling and Imputation of Incomplete Data

138 - Pierre-Alexandre Mattei , Jes Frellsen 2018

We consider the problem of handling missing data with deep latent variable models (DLVMs). First, we present a simple technique to train DLVMs when the training set contains missing-at-random data. Our approach, called MIWAE, is based on the importan ce-weighted autoencoder (IWAE), and maximises a potentially tight lower bound of the log-likelihood of the observed data. Compared to the original IWAE, our algorithm does not induce any additional computational overhead due to the missing data. We also develop Monte Carlo techniques for single and multiple imputation using a DLVM trained on an incomplete data set. We illustrate our approach by training a convolutional DLVM on a static binarisation of MNIST that contains 50% of missing pixels. Leveraging multiple imputation, a convolutional network trained on these incomplete digits has a test performance similar to one trained on complete data. On various continuous and binary data sets, we also show that MIWAE provides accurate single imputations, and is highly competitive with state-of-the-art methods.

التعلم الالي التعلم الآلي المنهجية