No Arabic abstract
Cell type (e.g. pluripotent cell, fibroblast) is the end result of many complex processes that unfold due to evolutionary, developmental, and transformational stimuli. A cells phenotype and the discrete, a priori states that define various cell subtypes (e.g. skin fibroblast, embryonic stem cell) are ultimately part of a continuum that may predict changes and systematic variation in cell subtypes. These features can be both observable in existing cellular states and hypothetical (e.g. unobserved). In this paper, a series of approaches will be used to approximate the continuous diversity of gene expression across a series of pluripotent, totipotent, and fibroblast cellular subtypes. We will use a series of previously-collected datasets and analyze them using three complementary approaches: the computation of distances based on the subsampling of diversity, assessing the separability of individual genes for a specific cell line both within and between cell types, and a hierarchical soft classification technique that will assign a membership value for specific genes in specific cell types given a number of different criteria. These approaches will allow us to assess the observed gene-expression diversity in these datasets, as well as assess how well a priori cell types characterize their constituent populations. In conclusion, the application of these findings to a broader biological context will be discussed.
Exploiting recent developments in information theory, we propose, illustrate, and validate a principled information-theoretic algorithm for module discovery and resulting measure of network modularity. This measure is an order parameter (a dimensionless number between 0 and 1). Comparison is made to other approaches to module-discovery and to quantifying network modularity using Monte Carlo generated Erdos-like modular networks. Finally, the Network Information Bottleneck (NIB) algorithm is applied to a number of real world networks, including the social network of coauthors at the APS March Meeting 2004.
By performing a comprehensive study on 1832 segments of 1212 complete genomes of viruses, we show that in viral genomes the hairpin structures of thermodynamically predicted RNA secondary structures are more abundant than expected under a simple random null hypothesis. The detected hairpin structures of RNA secondary structures are present both in coding and in noncoding regions for the four groups of viruses categorized as dsDNA, dsRNA, ssDNA and ssRNA. For all groups hairpin structures of RNA secondary structures are detected more frequently than expected for a random null hypothesis in noncoding rather than in coding regions. However, potential RNA secondary structures are also present in coding regions of dsDNA group. In fact we detect evolutionary conserved RNA secondary structures in conserved coding and noncoding regions of a large set of complete genomes of dsDNA herpesviruses.
Comparative transcriptomics has gained increasing popularity in genomic research thanks to the development of high-throughput technologies including microarray and next-generation RNA sequencing that have generated numerous transcriptomic data. An important question is to understand the conservation and differentiation of biological processes in different species. We propose a testing-based method TROM (Transcriptome Overlap Measure) for comparing transcriptomes within or between different species, and provide a different perspective to interpret transcriptomic similarity in contrast to traditional correlation analyses. Specifically, the TROM method focuses on identifying associated genes that capture molecular characteristics of biological samples, and subsequently comparing the biological samples by testing the overlap of their associated genes. We use simulation and real data studies to demonstrate that TROM is more powerful in identifying similar transcriptomes and more robust to stochastic gene expression noise than Pearson and Spearman correlations. We apply TROM to compare the developmental stages of six Drosophila species, C. elegans, S. purpuratus, D. rerio and mouse liver, and find interesting correspondence patterns that imply conserved gene expression programs in the development of these species. The TROM method is available as an R package on CRAN (http://cran.r-project.org/) with manuals and source codes available at http://www.stat.ucla.edu/ jingyi.li/software-and-data/trom.html.
Aggregating transcriptomics data across hospitals can increase sensitivity and robustness of differential expression analyses, yielding deeper clinical insights. As data exchange is often restricted by privacy legislation, meta-analyses are frequently employed to pool local results. However, if class labels are inhomogeneously distributed between cohorts, their accuracy may drop. Flimma (https://exbio.wzw.tum.de/flimma/) addresses this issue by implementing the state-of-the-art workflow limma voom in a privacy-preserving manner, i.e. patient data never leaves its source site. Flimma results are identical to those generated by limma voom on combined datasets even in imbalanced scenarios where meta-analysis approaches fail.
Mathematical modelling and numerical simulations of interaction populations are crucial topics in systems biology. The interactions of ecological models may occur among individuals of the same species or individuals of different species. Describing the dynamics of such models occasionally requires some techniques of model analysis. Choosing appropriate techniques of model analysis is often a difficult task. We define a prey (mouse) and predator (cat) model. The system is modelled by a pair of non-linear ordinary differential equations using mass action law, under constant rates. A proper scaling is suggested to minimize the number of parameters. More interestingly, we propose a homotopy technique with n expanding parame- ters for finding some analytical approximate solutions. Furthermore, using the local sensitivity method is another important step forward in this study because it helps to identify critical model parameters. Numerical simulations are provided using Matlab for different parameters and initial conditions.