ترغب بنشر مسار تعليمي؟ اضغط هنا

Deciphering gene regulation from gene expression dynamics using deep neural network

94   0   0.0 ( 0 )
 نشر من قبل Jingxiang Shen
 تاريخ النشر 2018
  مجال البحث علم الأحياء
والبحث باللغة English




اسأل ChatGPT حول البحث

Complex biological functions are carried out by the interaction of genes and proteins. Uncovering the gene regulation network behind a function is one of the central themes in biology. Typically, it involves extensive experiments of genetics, biochemistry and molecular biology. In this paper, we show that much of the inference task can be accomplished by a deep neural network (DNN), a form of machine learning or artificial intelligence. Specifically, the DNN learns from the dynamics of the gene expression. The learnt DNN behaves like an accurate simulator of the system, on which one can perform in-silico experiments to reveal the underlying gene network. We demonstrate the method with two examples: biochemical adaptation and the gap-gene patterning in fruit fly embryogenesis. In the first example, the DNN can successfully find the two basic network motifs for adaptation - the negative feedback and the incoherent feed-forward. In the second and much more complex example, the DNN can accurately predict behaviors of essentially all the mutants. Furthermore, the regulation network it uncovers is strikingly similar to the one inferred from experiments. In doing so, we develop methods for deciphering the gene regulation network hidden in the DNN black box. Our interpretable DNN approach should have broad applications in genotype-phenotype mapping.



قيم البحث

اقرأ أيضاً

Inferring functional relationships within complex networks from static snapshots of a subset of variables is a ubiquitous problem in science. For example, a key challenge of systems biology is to translate cellular heterogeneity data obtained from si ngle-cell sequencing or flow-cytometry experiments into regulatory dynamics. We show how static population snapshots of co-variability can be exploited to rigorously infer properties of gene expression dynamics when gene expression reporters probe their upstream dynamics on separate time-scales. This can be experimentally exploited in dual-reporter experiments with fluorescent proteins of unequal maturation times, thus turning an experimental bug into an analysis feature. We derive correlation conditions that detect the presence of closed-loop feedback regulation in gene regulatory networks. Furthermore, we show how genes with cell-cycle dependent transcription rates can be identified from the variability of co-regulated fluorescent proteins. Similar correlation constraints might prove useful in other areas of science in which static correlation snapshots are used to infer causal connections between dynamically interacting components.
In unicellular organisms such as bacteria the same acquired mutations beneficial in one environment can be restrictive in another. However, evolving Escherichia coli populations demonstrate remarkable flexibility in adaptation. The mechanisms sustain ing genetic flexibility remain unclear. In E. coli the transcriptional regulation of gene expression involves both dedicated regulators binding specific DNA sites with high affinity and also global regulators - abundant DNA architectural proteins of the bacterial chromoid binding multiple low affinity sites and thus modulating the superhelical density of DNA. The first form of transcriptional regulation is dominantly pairwise and specific, representing digitial control, while the second form is (in strength and distribution) continuous, representing analog control. Here we look at the properties of effective networks derived from significant gene expression changes under variation of the two forms of control and find that upon limitations of one type of control (caused e.g. by mutation of a global DNA architectural factor) the other type can compensate for compromised regulation. Mutations of global regulators significantly enhance the digital control; in the presence of global DNA architectural proteins regulation is mostly of the analog type, coupling spatially neighboring genomic loci; together our data suggest that two logically distinct types of control are balancing each other. By revealing two distinct logical types of control, our approach provides basic insights into both the organizational principles of transcriptional regulation and the mechanisms buffering genetic flexibility. We anticipate that the general concept of distinguishing logical types of control will apply to many complex biological networks.
A principal component analysis of the TCGA data for 15 cancer localizations unveils the following qualitative facts about tumors: 1) The state of a tissue in gene expression space may be described by a few variables. In particular, there is a single variable describing the progression from a normal tissue to a tumor. 2) Each cancer localization is characterized by a gene expression profile, in which genes have specific weights in the definition of the cancer state. There are no less than 2500 differentially-expressed genes, which lead to power-like tails in the expression distribution functions. 3) Tumors in different localizations share hundreds or even thousands of differentially expressed genes. There are 6 genes common to the 15 studied tumor localizations. 4) The tumor region is a kind of attractor. Tumors in advanced stages converge to this region independently of patient age or genetic variability. 5) There is a landscape of cancer in gene expression space with an approximate border separating normal tissues from tumors.
We train a neural network to predict chemical toxicity based on gene expression data. The input to the network is a full expression profile collected either in vitro from cultured cells or in vivo from live animals. The output is a set of fine graine d predictions for the presence of a variety of pathological effects in treated animals. When trained on the Open TG-GATEs database it produces good results, outperforming classical models trained on the same data. This is a promising approach for efficiently screening chemicals for toxic effects, and for more accurately evaluating drug candidates based on preclinical data.
77 - Olga Zolotareva 2020
Aggregating transcriptomics data across hospitals can increase sensitivity and robustness of differential expression analyses, yielding deeper clinical insights. As data exchange is often restricted by privacy legislation, meta-analyses are frequentl y employed to pool local results. However, if class labels are inhomogeneously distributed between cohorts, their accuracy may drop. Flimma (https://exbio.wzw.tum.de/flimma/) addresses this issue by implementing the state-of-the-art workflow limma voom in a privacy-preserving manner, i.e. patient data never leaves its source site. Flimma results are identical to those generated by limma voom on combined datasets even in imbalanced scenarios where meta-analysis approaches fail.
التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا