Using ontology embeddings for structural inductive bias in gene expression data analysis

Added by Maja Trębacz

Publication date 2020

fields Biology Informatics Engineering

Research language: English

Authors: Maja Trębacz, Zohreh Shams, Mateja Jamnik

Genomics Machine Learning

Stratifying cancer patients based on their gene expression levels can improve diagnosis, survival analysis and treatment planning. However, such data is extremely high-dimensional: it contains expression values for over 20,000 genes per patient, while the number of samples in the datasets is low. To deal with such settings, we propose to incorporate prior biological knowledge about genes from ontologies into the machine learning system for the task of classifying patients given their gene expression data. We use ontology embeddings that capture the semantic similarities between the genes to direct a Graph Convolutional Network, and thereby sparsify the network connections. We show that this approach provides an advantage for predicting clinical targets from high-dimensional, low-sample data.
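As a rough illustration of the idea in the abstract above, the following minimal sketch links each gene to its most similar genes in embedding space and uses the resulting sparse graph in a single graph-convolution step. It is not the authors' implementation: the random `gene_embeddings` stand in for real ontology embeddings, and the k-nearest-neighbour rule, the cosine similarity, the sizes, and all function names are assumptions made for this example.

```python
import numpy as np

def knn_adjacency(embeddings, k):
    """Link each gene to its k most cosine-similar genes (symmetric, with self-loops)."""
    unit = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = unit @ unit.T
    np.fill_diagonal(sim, -np.inf)              # exclude self when ranking neighbours
    n = sim.shape[0]
    neighbours = np.argsort(-sim, axis=1)[:, :k]
    adj = np.zeros((n, n))
    adj[np.repeat(np.arange(n), k), neighbours.ravel()] = 1.0
    adj = np.maximum(adj, adj.T)                # make the graph undirected
    np.fill_diagonal(adj, 1.0)                  # add self-loops
    return adj

def normalize_adjacency(adj):
    """Symmetric normalisation D^{-1/2} A D^{-1/2}."""
    d_inv_sqrt = 1.0 / np.sqrt(adj.sum(axis=1))
    return adj * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_layer(x, adj_norm, weight):
    """One graph-convolution step: ReLU(A_norm X W), with x of shape (genes, features)."""
    return np.maximum(adj_norm @ x @ weight, 0.0)

rng = np.random.default_rng(0)
n_genes, emb_dim, k = 50, 8, 3                  # toy sizes, assumed for illustration
gene_embeddings = rng.normal(size=(n_genes, emb_dim))  # placeholder for ontology embeddings
expression = rng.normal(size=(n_genes, 1))      # one patient's expression value per gene

adj = knn_adjacency(gene_embeddings, k)
adj_norm = normalize_adjacency(adj)
weight = rng.normal(size=(1, 4))                # untrained weights, for shape checking only
hidden = gcn_layer(expression, adj_norm, weight)
```

Because each gene keeps only a handful of neighbours, the number of nonzero connections grows roughly linearly with the number of genes rather than quadratically, which is the kind of sparsification the abstract refers to.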

Regularization Strategies for Hyperplane Classifiers: Application to Cancer Classification with Gene Expression Data

Erik Andries, 2006

Linear discrimination, from the point of view of numerical linear algebra, can be treated as solving an ill-posed system of linear equations. In order to generate a solution that is robust in the presence of noise, these problems require regularization. Here, we examine the ill-posedness involved in the linear discrimination of cancer gene expression data with respect to outcome and tumor subclasses. We show that a filter factor representation, based upon Singular Value Decomposition, yields insight into the numerical ill-posedness of the hyperplane-based separation when applied to gene expression data. We also show that this representation yields useful diagnostic tools for guiding the selection of classifier parameters, thus leading to improved performance.
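To make the filter factor idea concrete, here is a small sketch that assumes Tikhonov (ridge) regularization, which the abstract does not name explicitly. Under that assumption the filter factors are s_i^2 / (s_i^2 + lam^2), where s_i are the singular values of the data matrix. The toy data, the parameter `lam`, and the function name are hypothetical.

```python
import numpy as np

def filter_factor_solution(X, y, lam):
    """Regularized least-squares solution written as filtered SVD components."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    factors = s**2 / (s**2 + lam**2)        # Tikhonov filter factors, each in (0, 1)
    coeffs = factors * (U.T @ y) / s        # damp components with small singular values
    return Vt.T @ coeffs, factors

rng = np.random.default_rng(1)
n_samples, n_genes = 20, 200                # few samples, many genes (toy sizes)
X = rng.normal(size=(n_samples, n_genes))
y = rng.choice([-1.0, 1.0], size=n_samples) # two class labels
lam = 1.0

w, factors = filter_factor_solution(X, y, lam)

# Cross-check against the closed-form ridge solution (X^T X + lam^2 I)^{-1} X^T y.
w_ridge = np.linalg.solve(X.T @ X + lam**2 * np.eye(n_genes), X.T @ y)
```

Inspecting `factors` shows which singular directions the regularization suppresses: values near 1 pass through almost unchanged, while values near 0 are effectively filtered out.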

Genomics

Prediction of gene expression time series and structural analysis of gene regulatory networks using recurrent neural networks

Michele Monti, Jonathan Fiorentino, Edoardo Milanetti, 2021

Methods for time series prediction and classification of gene regulatory networks (GRNs) from gene expression data have been treated separately so far. The recent emergence of attention-based recurrent neural network (RNN) models boosted the interpretability of RNN parameters, making them appealing for the understanding of gene interactions. In this work, we generated synthetic time series gene expression data from a range of archetypal GRNs and we relied on a dual attention RNN to predict the gene temporal dynamics. We show that the prediction is extremely accurate for GRNs with different architectures. Next, we focused on the attention mechanism of the RNN and, using tools from graph theory, we found that its graph properties allow different architectures of the GRN to be distinguished hierarchically. We show that the GRNs respond differently to the addition of noise in the prediction by the RNN and we relate the noise response to the analysis of the attention mechanism. In conclusion, this work provides a way to understand and exploit the attention mechanism of RNNs, and it paves the way to RNN-based methods for time series prediction and inference of GRNs from gene expression data.

Biological Physics Machine Learning Data Analysis Statistics and Probability

Predicting Toxicity from Gene Expression with Neural Networks

Peter Eastman, Vijay S. Pande, 2019

We train a neural network to predict chemical toxicity based on gene expression data. The input to the network is a full expression profile collected either in vitro from cultured cells or in vivo from live animals. The output is a set of fine-grained predictions for the presence of a variety of pathological effects in treated animals. When trained on the Open TG-GATEs database, it produces good results, outperforming classical models trained on the same data. This is a promising approach for efficiently screening chemicals for toxic effects, and for more accurately evaluating drug candidates based on preclinical data.

Genomics

Predicting Gene Expression Between Species with Neural Networks

Peter Eastman, Vijay S. Pande, 2019

We train a neural network to predict human gene expression levels based on experimental data for rat cells. The network is trained with paired human/rat samples from the Open TG-GATEs database, where paired samples were treated with the same compound at the same dose. When evaluated on a test set of held-out compounds, the network successfully predicts human expression levels. On the majority of the test compounds, the list of differentially expressed genes determined from predicted expression levels agrees well with the list of differentially expressed genes determined from actual human experimental data.

Genomics

Two distinct logical types of network control in gene expression profiles

Carsten Marr, Marcel Geertz, Marc-Thorsten Huett, 2007

In unicellular organisms such as bacteria the same acquired mutations beneficial in one environment can be restrictive in another. However, evolving Escherichia coli populations demonstrate remarkable flexibility in adaptation. The mechanisms sustaining genetic flexibility remain unclear. In E. coli the transcriptional regulation of gene expression involves both dedicated regulators binding specific DNA sites with high affinity and also global regulators - abundant DNA architectural proteins of the bacterial chromoid binding multiple low affinity sites and thus modulating the superhelical density of DNA. The first form of transcriptional regulation is dominantly pairwise and specific, representing digital control, while the second form is (in strength and distribution) continuous, representing analog control. Here we look at the properties of effective networks derived from significant gene expression changes under variation of the two forms of control and find that upon limitations of one type of control (caused e.g. by mutation of a global DNA architectural factor) the other type can compensate for compromised regulation. Mutations of global regulators significantly enhance the digital control; in the presence of global DNA architectural proteins regulation is mostly of the analog type, coupling spatially neighboring genomic loci; together our data suggest that two logically distinct types of control are balancing each other. By revealing two distinct logical types of control, our approach provides basic insights into both the organizational principles of transcriptional regulation and the mechanisms buffering genetic flexibility. We anticipate that the general concept of distinguishing logical types of control will apply to many complex biological networks.

Genomics Molecular Networks
