Community-Wide Evaluation of Computational Function Prediction


Added by Iddo Friedberg

Publication date 2016

fields Biology

Language English

Authors Iddo Friedberg - Predrag Radivojac

Quantitative Methods


A biological experiment is the most reliable way of assigning function to a protein. However, in the era of high-throughput sequencing, scientists are unable to carry out experiments to determine the function of every single gene product. Therefore, to gain insights into the activity of these molecules and guide experiments, we must rely on computational means to functionally annotate the majority of sequence data. To understand how well these algorithms perform, we have established a challenge involving a broad scientific community in which we evaluate different annotation methods according to their ability to predict the associations between previously unannotated protein sequences and Gene Ontology terms. Here we discuss the rationale, benefits and issues associated with evaluating computational methods in an ongoing community-wide challenge.


Computational diagnosis and risk evaluation for canine lymphoma

E. M. Mirkes, I. Alexandrakis, K. Slater 2013

The canine lymphoma blood test detects the levels of two biomarkers, the acute phase proteins C-Reactive Protein (CRP) and Haptoglobin. This test can be used for diagnostics, for screening, and for remission monitoring. We analyze clinical data, test various machine learning methods and select the best approach to these problems. Three families of methods, decision trees, kNN (including advanced and adaptive kNN) and probability density evaluation with radial basis functions, are used for classification and risk estimation. Several pre-processing approaches were implemented and compared. The best of them are used to create the diagnostic system. For the differential diagnosis, the best solution gives sensitivity and specificity of 83.5% and 77%, respectively (using three input features: CRP, Haptoglobin and a standard clinical symptom). For the screening task, the decision tree method provides the best result, with sensitivity and specificity of 81.4% and >99%, respectively (using the same input features). If the clinical symptom (lymphadenopathy) is treated as unknown, then a decision tree with CRP and Haptoglobin only provides sensitivity of 69% and specificity of 83.5%. The lymphoma risk evaluation problem is formulated and solved. The best models are selected as the system for computational lymphoma diagnosis and for evaluating the risk of lymphoma. These methods are implemented in a dedicated web-accessed software tool and are applied to the problem of monitoring dogs with lymphoma after treatment. It detects recurrence of lymphoma up to two months prior to the appearance of clinical signs. The risk map visualisation provides a friendly tool for explanatory data analysis.

Quantitative Methods Applications
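
The sensitivity and specificity figures quoted in the canine lymphoma abstract above follow from confusion-matrix counts: sensitivity is TP / (TP + FN) and specificity is TN / (TN + FP). The following is a minimal sketch of that arithmetic only. The function name and the toy labels are hypothetical and are not taken from the paper or its data.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels (1 = disease, 0 = healthy)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # diseased, flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # diseased, missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # healthy, cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # healthy, flagged
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return tp / (tp + fn), tn / (tn + fp)


# Toy labels, invented for illustration (not clinical data).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.3f} specificity={spec:.3f}")
```

A screening test is usually tuned for high specificity, so that few healthy animals are flagged, which is consistent with the >99% specificity reported for the screening task.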

Collective properties of cellular identity: a computational approach

Bradly Alicea 2013

Cell type (e.g. pluripotent cell, fibroblast) is the end result of many complex processes that unfold due to evolutionary, developmental, and transformational stimuli. A cell's phenotype and the discrete, a priori states that define various cell subtypes (e.g. skin fibroblast, embryonic stem cell) are ultimately part of a continuum that may predict changes and systematic variation in cell subtypes. These features can be both observable in existing cellular states and hypothetical (e.g. unobserved). In this paper, a series of approaches will be used to approximate the continuous diversity of gene expression across a series of pluripotent, totipotent, and fibroblast cellular subtypes. We will use a series of previously collected datasets and analyze them using three complementary approaches: the computation of distances based on the subsampling of diversity, assessing the separability of individual genes for a specific cell line both within and between cell types, and a hierarchical soft classification technique that will assign a membership value for specific genes in specific cell types given a number of different criteria. These approaches will allow us to assess the observed gene-expression diversity in these datasets, as well as assess how well a priori cell types characterize their constituent populations. In conclusion, the application of these findings to a broader biological context will be discussed.

Quantitative Methods Genomics

Essential guidelines for computational method benchmarking

Lukas M. Weber, Wouter Saelens, Robrecht Cannoodt 2018

In computational biology and other sciences, researchers are frequently faced with a choice between several computational methods for performing data analyses. Benchmarking studies aim to rigorously compare the performance of different methods using well-characterized benchmark datasets, to determine the strengths of each method or to provide recommendations regarding suitable choices of methods for an analysis. However, benchmarking studies must be carefully designed and implemented to provide accurate, unbiased, and informative results. Here, we summarize key practical guidelines and recommendations for performing high-quality benchmarking analyses, based on our experiences in computational biology.

Quantitative Methods Applications

Minimizing the number of optimizations for efficient community dynamic flux balance analysis

James D. Brunner, Nicholas Chia 2020

Dynamic flux balance analysis uses a quasi-steady state assumption to calculate an organism's metabolic activity at each time step of a dynamic simulation, using the well-known technique of flux balance analysis. For microbial communities, this calculation is especially costly and involves solving a linear constrained optimization problem for each member of the community at each time step. However, this is unnecessary and inefficient, as prior solutions can be used to inform future time steps. Here, we show that a basis for the space of internal fluxes can be chosen for each microbe in a community and this basis can be used to simulate forward by solving a relatively inexpensive system of linear equations at most time steps. We can use this solution as long as the resulting metabolic activity remains within the optimization problem's constraints (i.e. the solution to the linear system of equations remains a feasible solution to the linear program). As the solution becomes infeasible, it first becomes a feasible but degenerate solution to the optimization problem, and we can solve a different but related optimization problem to choose an appropriate basis to continue forward simulation. We demonstrate the efficiency and robustness of our method by comparing with currently used methods on a four-species community, and show that our method requires at least 91% fewer optimizations to be solved. For reproducibility, we prototyped the method using Python. Source code is available at https://github.com/jdbrunner/surfin_fba.

Quantitative Methods
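
The basis-reuse idea in the dynamic flux balance analysis abstract above can be illustrated on a toy problem. The sketch below is not the authors' implementation and does not use the surfin_fba package. It assumes a two-variable linear program, max c·v subject to A v <= b(t), in which only the bounds b change over time, and it uses a brute-force vertex search as a stand-in for a real LP solver. All names (`solve2`, `optimize`, `simulate`, `b_of_t`) and all numbers are hypothetical. Because only the right-hand side changes, the previously optimal pair of tight constraints stays optimal as long as the point it defines remains feasible, so most steps need only a 2x2 linear solve.

```python
from itertools import combinations


def solve2(a1, a2, b1, b2):
    """Solve the 2x2 system a1.v = b1, a2.v = b2 by Cramer's rule (None if singular)."""
    det = a1[0] * a2[1] - a1[1] * a2[0]
    if abs(det) < 1e-12:
        return None
    return ((b1 * a2[1] - a1[1] * b2) / det, (a1[0] * b2 - b1 * a2[0]) / det)


def feasible(A, b, v, tol=1e-9):
    """Check A v <= b row by row."""
    return all(r[0] * v[0] + r[1] * v[1] <= bi + tol for r, bi in zip(A, b))


def optimize(A, b, c):
    """Toy LP solver: try every pair of constraints and keep the best feasible vertex."""
    best = None
    for i, j in combinations(range(len(A)), 2):
        v = solve2(A[i], A[j], b[i], b[j])
        if v is None or not feasible(A, b, v):
            continue
        value = c[0] * v[0] + c[1] * v[1]
        if best is None or value > best[0]:
            best = (value, (i, j), v)
    if best is None:
        raise ValueError("linear program is infeasible")
    return best[1], best[2]


def simulate(A, b_of_t, c, steps):
    """Reuse the last optimal basis while it stays feasible; re-optimize otherwise."""
    basis, n_opt, trajectory = None, 0, []
    for t in range(steps):
        b = b_of_t(t)
        v = None
        if basis is not None:
            i, j = basis
            v = solve2(A[i], A[j], b[i], b[j])  # cheap linear solve
            if v is not None and not feasible(A, b, v):
                v = None  # basis no longer valid
        if v is None:
            basis, v = optimize(A, b, c)  # full optimization
            n_opt += 1
        trajectory.append(v)
    return trajectory, n_opt


# Hypothetical toy model: v1 <= 6 - t, v2 <= 4, v1 + v2 <= 8, v1 >= 0, v2 >= 0.
A = [(1, 0), (0, 1), (1, 1), (-1, 0), (0, -1)]
c = (2, 1)


def b_of_t(t):
    return [6 - t, 4, 8, 0, 0]


trajectory, n_opt = simulate(A, b_of_t, c, steps=6)
print(trajectory, n_opt)
```

In this toy run the basis chosen at the first step is reused until the bound on v2 is violated. At the step just before that, the reused solution sits on three constraints at once, which is the feasible-but-degenerate situation the abstract describes.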

On the Origins and Control of Community Types in the Human Microbiome

Travis E. Gibson, Amir Bashan, Hong-Tai Cao 2015

Microbiome-based stratification of healthy individuals into compositional categories, referred to as community types, holds promise for drastically improving personalized medicine. Despite this potential, the existence of community types and the degree of their distinctness have been highly debated. Here we adopted a dynamic systems approach and found that heterogeneity in the interspecific interactions or the presence of strongly interacting species is sufficient to explain community types, independent of the topology of the underlying ecological network. By controlling the presence or absence of these strongly interacting species we can steer the microbial ecosystem to any desired community type. This open-loop control strategy still holds even when the community types are not distinct but appear as dense regions within a continuous gradient. This finding can be used to develop viable therapeutic strategies for shifting the microbial composition to a healthy configuration.

Quantitative Methods Systems and Control