ﻻ يوجد ملخص باللغة العربية
Motivation: Predictive modelling of gene expression is a powerful framework for the in silico exploration of transcriptional regulatory interactions through the integration of high-throughput -omics data. A major limitation of previous approaches is their inability to handle conditional and synergistic interactions that emerge when collectively analysing genes subject to different regulatory mechanisms. This limitation reduces overall predictive power and thus the reliability of downstream biological inference. Results: We introduce an analytical modelling framework (TREEOME: tree of models of expression) that integrates epigenetic and transcriptomic data by separating genes into putative regulatory classes. Current predictive modelling approaches have found both DNA methylation and histone modification epigenetic data to provide little or no improvement in accuracy of prediction of transcript abundance despite, for example, distinct anti-correlation between mRNA levels and promoter-localised DNA methylation. To improve on this, in TREEOME we evaluate four possible methods of formulating gene-level DNA methylation metrics, which provide a foundation for identifying gene-level methylation events and subsequent differential analysis, whereas most previous techniques operate at the level of individual CpG dinucleotides. We demonstrate TREEOME by integrating gene-level DNA methylation (bisulfite-seq) and histone modification (ChIP-seq) data to accurately predict genome-wide mRNA transcript abundance (RNA-seq) for H1-hESC and GM12878 cell lines. Availability: TREEOME is implemented using open-source software and made available as a pre-configured bootable reference environment. All scripts and data presented in this study are available online at http://sourceforge.net/projects/budden2015treeome/.
We present a nonparametric Bayesian method for disease subtype discovery in multi-dimensional cancer data. Our method can simultaneously analyse a wide range of data types, allowing for both agreement and disagreement between their underlying cluster
Gene transcription is a stochastic process mostly occurring in bursts. Regulation of transcription arises from the interaction of transcription factors (TFs) with the promoter of the gene. The TFs, such as activators and repressors can interact with
This paper introduces a high-throughput software tool framework called {it sam2bam} that enables users to significantly speedup pre-processing for next-generation sequencing data. The sam2bam is especially efficient on single-node multi-core large-me
Network of packages with regulatory interactions (dependences and conflicts) from Debian GNU/Linux operating system is compiled and used as analogy of a gene regulatory network. Using a trace-back algorithm we assembly networks from the potential poo
Advances in DNA sequencing have revolutionized our ability to read genomes. However, even in the most well-studied of organisms, the bacterium ${it Escherichia coli}$, for $approx$ 65$%$ of the promoters we remain completely ignorant of their regulat