
Variation-preserving normalization unveils blind spots in gene expression profiling

Published by: Carlos P. Roca
Publication date: 2015
Research field: Biology
Research language: English





RNA-Seq and gene expression microarrays provide comprehensive profiles of gene activity, but lack of reproducibility has hindered their application. A key challenge in the data analysis is the normalization of gene expression levels, which is currently performed following the implicit assumption that most genes are not differentially expressed. Here, we present a mathematical approach to normalization that makes no assumption of this sort. We have found that variation in gene expression is much larger than currently believed, and that it can be measured with available assays. Our results also explain, at least partially, the reproducibility problems encountered in transcriptomics studies. We expect that this improvement in detection will help efforts to realize the full potential of gene expression profiling, especially in analyses of cellular processes involving complex modulations of gene expression.
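The implicit assumption mentioned in the abstract can be illustrated with a toy example. The sketch below is not the authors' method: it applies plain per-sample median centering (a common conventional normalization, chosen here only for illustration) to hypothetical log2 expression values, and shows how a real shift affecting most genes is absorbed by the normalization. The function name `median_center` and all data values are assumptions made up for this example.

```python
from statistics import median


def median_center(log_expr):
    """Subtract the sample's median log-expression from every gene."""
    m = median(log_expr)
    return [x - m for x in log_expr]


# Hypothetical log2 expression values for five genes (illustrative only).
control = [5.0, 6.0, 7.0, 8.0, 9.0]
# Assumption: four of the five genes are truly up-regulated by 1 log2 unit.
treated = [6.0, 7.0, 8.0, 9.0, 9.0]

# True log fold changes, before any normalization.
true_lfc = [t - c for t, c in zip(treated, control)]

# Apparent log fold changes after median-centering each sample.
apparent_lfc = [
    t - c for t, c in zip(median_center(treated), median_center(control))
]

print(true_lfc)      # [1.0, 1.0, 1.0, 1.0, 0.0]
print(apparent_lfc)  # [0.0, 0.0, 0.0, 0.0, -1.0]
```

In this toy case the four genes that changed appear unchanged after normalization, and the one gene that did not change appears down-regulated. This is the kind of blind spot that can arise when a normalization assumes most genes are not differentially expressed.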


Read also

Inferring functional relationships within complex networks from static snapshots of a subset of variables is a ubiquitous problem in science. For example, a key challenge of systems biology is to translate cellular heterogeneity data obtained from single-cell sequencing or flow-cytometry experiments into regulatory dynamics. We show how static population snapshots of co-variability can be exploited to rigorously infer properties of gene expression dynamics when gene expression reporters probe their upstream dynamics on separate time-scales. This can be experimentally exploited in dual-reporter experiments with fluorescent proteins of unequal maturation times, thus turning an experimental bug into an analysis feature. We derive correlation conditions that detect the presence of closed-loop feedback regulation in gene regulatory networks. Furthermore, we show how genes with cell-cycle dependent transcription rates can be identified from the variability of co-regulated fluorescent proteins. Similar correlation constraints might prove useful in other areas of science in which static correlation snapshots are used to infer causal connections between dynamically interacting components.
77 - Olga Zolotareva 2020
Aggregating transcriptomics data across hospitals can increase sensitivity and robustness of differential expression analyses, yielding deeper clinical insights. As data exchange is often restricted by privacy legislation, meta-analyses are frequently employed to pool local results. However, if class labels are inhomogeneously distributed between cohorts, their accuracy may drop. Flimma (https://exbio.wzw.tum.de/flimma/) addresses this issue by implementing the state-of-the-art workflow limma voom in a privacy-preserving manner, i.e. patient data never leaves its source site. Flimma results are identical to those generated by limma voom on combined datasets even in imbalanced scenarios where meta-analysis approaches fail.
Interlocus gene conversion (IGC) homogenizes paralogs. Little is known regarding the mutation events that cause IGC and even less is known about the IGC mutations that experience fixation. To disentangle the rates of fixed IGC mutations from the tract lengths of these fixed mutations, we employ a composite likelihood procedure. We characterize the procedure with simulations. We apply the procedure to duplicated primate introns and to protein-coding paralogs from both yeast and primates. Our estimates from protein-coding data concerning the mean length of fixed IGC tracts were unexpectedly low and are associated with high degrees of uncertainty. In contrast, our estimates from the primate intron data had lengths in the general range expected from IGC mutation studies. While it is challenging to separate the rate at which fixed IGC mutations initiate from the average number of nucleotide positions that these IGC events affect, all of our analyses indicate that IGC is responsible for a substantial proportion of evolutionary change in duplicated regions. Our results suggest that IGC should be considered whenever the evolution of multigene families is examined.
Complex biological functions are carried out by the interaction of genes and proteins. Uncovering the gene regulation network behind a function is one of the central themes in biology. Typically, it involves extensive experiments of genetics, biochemistry and molecular biology. In this paper, we show that much of the inference task can be accomplished by a deep neural network (DNN), a form of machine learning or artificial intelligence. Specifically, the DNN learns from the dynamics of the gene expression. The learnt DNN behaves like an accurate simulator of the system, on which one can perform in-silico experiments to reveal the underlying gene network. We demonstrate the method with two examples: biochemical adaptation and the gap-gene patterning in fruit fly embryogenesis. In the first example, the DNN can successfully find the two basic network motifs for adaptation - the negative feedback and the incoherent feed-forward. In the second and much more complex example, the DNN can accurately predict behaviors of essentially all the mutants. Furthermore, the regulation network it uncovers is strikingly similar to the one inferred from experiments. In doing so, we develop methods for deciphering the gene regulation network hidden in the DNN black box. Our interpretable DNN approach should have broad applications in genotype-phenotype mapping.
130 - Bradly Alicea 2013
Eight molecular datasets involving human and teleost examples, along with morphological samples from several groups of Neotropical electric fish (Order: Gymnotiformes), were analyzed in this thesis to test the dynamics of both intraspecific variation and interspecific diversity. To investigate molecular interspecific diversity among humans, two experimental exercises were performed. A cladistic exchange experiment tested for the extent of discontinuity and interbreeding between H. sapiens and Neanderthal populations. As part of the same question, another experimental exercise tested the amount of molecular variance resulting from simulations which treated Neanderthals as being either a local population of modern humans or a distinct subspecies. Finally, comparisons of hominid populations over time with fish species helped to define what constitutes taxonomically relevant differences between morphological populations as expressed among both trait size ranges and through growth patterns that begin during ontogeny. Compared to the subdivision found within selected teleost species, H. sapiens molecular data exhibited little variation and discontinuity between geographical regions. The results of the two experimental exercises indicated that Neanderthals exhibit taxonomic distance from modern H. sapiens. However, this distance was not so great as to exclude the possibility of interbreeding between the two subspecific groups. Finally, a series of characters was analyzed among species of Neotropical electric fish. These analyses were compared with hominid examples to determine what constituted taxonomically relevant differences between populations as expressed among specific morphometric traits that develop during the juvenile phase.