
A Comparative Study of Machine Learning Methods for Predicting the Evolution of Brain Connectivity from a Baseline Timepoint

Published by Islem Rekik
Publication date: 2021
Research field: Informatics Engineering
Language: English





Predicting the evolution of the brain network, also called the connectome, by foreseeing changes in the connectivity weights linking pairs of anatomical regions makes it possible to spot connectivity-related neurological disorders at earlier stages and to detect the development of potential connectomic anomalies. Remarkably, this challenging prediction problem remains one of the least explored in the predictive connectomics literature. Machine learning (ML) methods have proven their predictive abilities in a wide variety of computer vision problems. However, ML techniques specifically tailored for predicting the brain connectivity evolution trajectory from a single timepoint are almost absent. To fill this gap, we organized a Kaggle competition in which 20 competing teams designed advanced machine learning pipelines for predicting brain connectivity evolution from a single timepoint. The competing teams developed their ML pipelines with a combination of data pre-processing, dimensionality reduction, and learning methods. Using an inclusive evaluation approach, we ranked the methods based on two complementary evaluation metrics (mean absolute error (MAE) and Pearson correlation coefficient (PCC)) and on their performance under different training and testing data perturbation strategies (single random split and cross-validation). The final rank was calculated using the rank product for each competing team across all evaluation measures and validation strategies. In support of open science, the 20 developed ML pipelines, along with the connectomic dataset, are made available on GitHub. The outcomes of this competition are anticipated to lead to the further development of predictive models that can foresee the evolution of brain connectivity over time, as well as that of other types of networks (e.g., genetic networks).
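The ranking scheme described in the abstract (per-metric ranks combined by a rank product) can be illustrated with a minimal sketch. The abstract does not give the exact formula, so this example assumes the common geometric-mean form of the rank product. The team names, toy connectivity values, and helper functions (`mae`, `pcc`, `rank`, `rank_product`) are hypothetical, and only one validation setting is shown.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error (lower is better)."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def pcc(y_true, y_pred):
    """Pearson correlation coefficient (higher is better)."""
    n = len(y_true)
    mean_t, mean_p = sum(y_true) / n, sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    norm_t = math.sqrt(sum((t - mean_t) ** 2 for t in y_true))
    norm_p = math.sqrt(sum((p - mean_p) ** 2 for p in y_pred))
    return cov / (norm_t * norm_p)

def rank(scores, higher_is_better=False):
    """Assign rank 1 to the best score; ties are broken by input order."""
    order = sorted(range(len(scores)), key=lambda i: scores[i],
                   reverse=higher_is_better)
    ranks = [0] * len(scores)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks

def rank_product(rank_lists):
    """Geometric mean of each team's ranks across all rank lists (assumed form)."""
    k = len(rank_lists)
    n_teams = len(rank_lists[0])
    return [math.prod(r[i] for r in rank_lists) ** (1.0 / k)
            for i in range(n_teams)]

# Toy ground-truth connectivity weights and three hypothetical submissions.
y_true = [0.0, 1.0, 2.0, 3.0]
predictions = {
    "team_a": [0.0, 1.0, 2.0, 4.0],
    "team_b": [0.0, 2.0, 1.0, 3.0],
    "team_c": [3.0, 2.0, 1.0, 0.0],
}

teams = list(predictions)
mae_scores = [mae(y_true, predictions[t]) for t in teams]
pcc_scores = [pcc(y_true, predictions[t]) for t in teams]

mae_ranks = rank(mae_scores, higher_is_better=False)
pcc_ranks = rank(pcc_scores, higher_is_better=True)

# In the full setting, one rank list per (metric, validation strategy) pair
# would be passed here; this sketch uses a single validation setting.
final_scores = rank_product([mae_ranks, pcc_ranks])
final_order = [teams[i] for i in sorted(range(len(teams)),
                                        key=lambda i: final_scores[i])]
print(final_order)
```

A lower rank product indicates a team that ranks consistently well across metrics, which is why the combination rewards methods that do well on both MAE and PCC rather than on only one.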


Read also

Accurate predictions of reactive mixing are critical for many Earth and environmental science problems. To investigate mixing dynamics over time under different scenarios, a high-fidelity, finite-element-based numerical model is built to solve the fast, irreversible bimolecular reaction-diffusion equations and simulate a range of reactive-mixing scenarios. A total of 2,315 simulations are performed using different sets of model input parameters comprising various spatial scales of vortex structures in the velocity field, time-scales associated with velocity oscillations, the perturbation parameter for the vortex-based velocity, anisotropic dispersion contrast, and molecular diffusion. Outputs comprise concentration profiles of the reactants and products. The inputs and outputs of these simulations are concatenated into feature and label matrices, respectively, to train 20 different machine learning (ML) emulators to approximate system behavior. The 20 ML emulators, based on linear methods, Bayesian methods, ensemble learning methods, and the multilayer perceptron (MLP), are compared against one another. The ML emulators are specifically trained to classify the state of mixing and predict three quantities of interest (QoIs) characterizing species production, decay, and degree of mixing. Linear classifiers and regressors fail to reproduce the QoIs; however, ensemble methods (classifiers and regressors) and the MLP accurately classify the state of reactive mixing and predict the QoIs. Among ensemble methods, random forest and decision-tree-based AdaBoost faithfully predict the QoIs. At run time, trained ML emulators are $\approx 10^5$ times faster than the high-fidelity numerical simulations. The speed and accuracy of the ensemble and MLP models facilitate uncertainty quantification, which usually requires thousands of model runs, to estimate the uncertainty bounds on the QoIs.
Rui He, Shan He, Ke Tang (2021)
Building classifiers on multiple domains is a practical problem in real life. Instead of building classifiers one by one, multi-domain learning (MDL) simultaneously builds classifiers on all the domains. MDL utilizes the information shared among the domains to improve performance. Because MDL is a supervised learning problem, the labeling effort remains high. Usually, this high labeling cost can be relieved by using active learning. Thus, it is natural to utilize active learning to reduce the labeling effort in MDL, and we refer to this setting as multi-domain active learning (MDAL). However, only a few works are built on this setting, and when researchers face this problem, there is no off-the-shelf solution. Under these circumstances, combining current multi-domain learning models with single-domain active learning strategies might be a preliminary solution for the MDAL problem. To find out the potential of this preliminary solution, a comparative study over 5 models and 4 active learning strategies is made in this paper. To the best of our knowledge, this is the first work that provides a formal definition of MDAL. It is also the first comparative work for the MDAL problem. From the results, the Multinomial Adversarial Networks (MAN) model with a simple best vs second best (BvSB) uncertainty strategy shows its superiority in most cases. We take this combination as our off-the-shelf recommendation for the MDAL problem.
Scientific computing relies on executing computer algorithms coded in some programming language. For a given hardware platform, algorithm speed is a crucial factor. There are many scientific computing environments used to code such algorithms. Matlab is one of the most successful and widespread scientific computing environments, rich in toolboxes, libraries, and data visualization tools. OpenCV is a C++-based library written primarily for computer vision and related areas. This paper presents a comparative study using 20 different real datasets to compare the speed of Matlab and OpenCV for some machine learning algorithms. Although Matlab is more convenient for development and data presentation, OpenCV is much faster in execution, with a speed ratio of more than 80 in some cases. The best of both worlds can be achieved by using Matlab or a similar environment to select the most successful algorithm, then implementing the selected algorithm in OpenCV or a similar environment to gain speed.
In the field of reproductive health, a vital aspect for the detection of male fertility issues is the analysis of human semen quality. Two factors of importance are the morphology and motility of the sperm cells. While the former describes defects in different parts of a spermatozoon, the latter measures the efficient movement of cells. For many non-human species, so-called Computer-Aided Sperm Analysis systems work well for assessing these characteristics from microscopic video recordings but struggle with human sperm samples which generally show higher degrees of debris and dead spermatozoa, as well as lower overall sperm motility. Here, machine learning methods that harness large amounts of training data to extract salient features could support physicians with the detection of fertility issues or in vitro fertilisation procedures. In this work, the overall motility of given sperm samples is predicted with the help of a machine learning framework integrating unsupervised methods for feature extraction with downstream regression models. The models evaluated herein improve on the state-of-the-art for video-based sperm-motility prediction.
Understanding and removing bias from the decisions made by machine learning models is essential to avoid discrimination against unprivileged groups. Despite recent progress in algorithmic fairness, there is still no clear answer as to which bias-mitigation approaches are most effective. Evaluation strategies are typically use-case specific, rely on data with unclear bias, and employ a fixed policy to convert model outputs to decision outcomes. To address these problems, we performed a systematic comparison of a number of popular fairness algorithms applicable to supervised classification. Our study is the most comprehensive of its kind. It utilizes three real and four synthetic datasets, and two different ways of converting model outputs to decisions. It considers fairness, predictive performance, calibration quality, and speed of 28 different modelling pipelines, corresponding to both fairness-unaware and fairness-aware algorithms. We found that fairness-unaware algorithms typically fail to produce adequately fair models and that the simplest algorithms are not necessarily the fairest ones. We also found that fairness-aware algorithms can induce fairness without material drops in predictive power. Finally, we found that dataset idiosyncrasies (e.g., degree of intrinsic unfairness, nature of correlations) do affect the performance of fairness-aware approaches. Our results allow the practitioner to narrow down the approach(es) they would like to adopt without having to know in advance their fairness requirements.
