We have developed BACOM2, a cross-platform, open-source Java application with a graphical user interface (GUI); users can also supply an XML file to set the algorithm parameters, file paths, and the dataset of paired samples. BACOM2 implements a new, complete pipeline for copy number change analysis of heterogeneous cancer tissues, including extraction of raw copy number signals from CEL files of paired samples, attenuation correction, identification of balanced AB-genotype loci, copy number detection and segmentation, global baseline calculation and absolute normalization, differentiation of deletion types, estimation of the normal tissue fraction, and correction of normal tissue contamination. BACOM2 focuses on the common tools for data preparation and absolute normalization in copy number analysis of heterogeneous cancer tissues. The software offers an additional option for scientists who need a user-friendly, fast, cross-platform computing environment for analyzing large copy number datasets.
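The final pipeline step, correction of normal tissue contamination, can be illustrated with a minimal Python sketch. It assumes a simple linear mixing model in which the observed copy number is a weighted average of the normal copy number (taken to be 2) and the tumor copy number, weighted by the normal tissue fraction. The function name, its parameters, and the mixing model itself are illustrative assumptions, not the BACOM2 implementation.

```python
def correct_normal_contamination(observed_cn, normal_fraction, normal_cn=2.0):
    """Recover a tumor-only copy number from a mixed-tissue signal.

    Assumed (hypothetical) linear mixing model:
        observed_cn = normal_fraction * normal_cn
                      + (1 - normal_fraction) * tumor_cn
    This function solves that equation for tumor_cn.
    """
    if not 0.0 <= normal_fraction < 1.0:
        raise ValueError("normal_fraction must be in [0, 1)")
    tumor_cn = (observed_cn - normal_fraction * normal_cn) / (1.0 - normal_fraction)
    # Noise can push the estimate slightly below zero; clamp it,
    # since a copy number cannot be negative.
    return max(tumor_cn, 0.0)


if __name__ == "__main__":
    # With 40% normal tissue, an observed copy number of 1.4 is
    # consistent with a single-copy (heterozygous) loss in the tumor cells.
    print(correct_normal_contamination(1.4, 0.4))
```

Under this assumed model, the same normal fraction turns an observed value of 0.8 into a tumor copy number of 0, which is how a homozygous deletion can look like a much milder loss in a contaminated sample.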
BACOM is a statistically principled, unsupervised method that uses allele-specific copy number signals to detect copy number deletion types (homozygous versus heterozygous), estimate the normal cell fraction, and recover cancer-specific copy number profiles. In a subsequent analysis of the TCGA ovarian cancer dataset, the average normal cell fraction estimated by BACOM was found to be higher than expected. In this letter, we first discuss the advantages of BACOM relative to alternative approaches. Then, we show that this elevated estimate of the normal cell fraction is the combined result of inaccurate signal modeling and normalization. Lastly, we describe an allele-specific signal modeling and normalization scheme that can enhance BACOM applications in many biological contexts. An open-source MATLAB program implementing our extended method has been developed and is publicly available.
Identifying subgroups and properties of cancer biopsy samples is a crucial step towards obtaining precise diagnoses and enabling personalized treatment of cancer patients. Recent data collections provide a comprehensive characterization of cancer cell data, including genetic data on copy number alterations (CNAs). We explore the potential to capture the information contained in cancer genomic data using a novel topology-based approach that encodes each cancer sample as a persistence diagram of topological features, i.e., high-dimensional voids represented in the data. We find that this technique has the potential to extract meaningful low-dimensional representations from cancer somatic genetic data, and we demonstrate the viability of some applications in finding substructures in cancer data as well as in comparing the similarity of cancer types.
Gene expression data for a set of 12 localizations from The Cancer Genome Atlas are processed in order to evaluate an entropy-like magnitude that allows tumors to be characterized and compared with the corresponding normal tissues. The comparison indicates that the number of available states in gene expression space is much greater for tumors than for normal tissues, and it points to a scaling relation between the fraction of available states and the overlap between the tumor and normal sample clouds.
Scripting languages are becoming increasingly important as a tool for software development, as they provide great flexibility for rapid prototyping and for configuring componentware applications. In this paper we present LuaJava, a scripting tool for Java. LuaJava adopts Lua, a dynamically typed interpreted language, as its scripting language. Great emphasis is given to the transparency of the integration between the two languages, so that objects from one language can be used inside the other like native objects. The final result of this integration is a tool that allows configurable Java applications to be built from off-the-shelf components at a high level of abstraction.
We present a novel mathematical model of heterogeneous cell proliferation where the total population consists of a subpopulation of slow-proliferating cells and a subpopulation of fast-proliferating cells. The model incorporates two cellular processes, asymmetric cell division and induced switching between proliferative states, which are important determinants for the heterogeneity of a cell population. As motivation for our model we provide experimental data that illustrate the induced-switching process. Our model consists of a system of two coupled delay differential equations with distributed time delays and the cell densities as functions of time. The distributed delays are bounded and allow for the choice of delay kernel. We analyse the model and prove the non-negativity and boundedness of solutions, the existence and uniqueness of solutions, and the local stability characteristics of the equilibrium points. We find that the parameters for induced switching are bifurcation parameters and therefore determine the long-term behaviour of the model. Numerical simulations illustrate and support the theoretical findings, and demonstrate the primary importance of transient dynamics for understanding the evolution of many experimental cell populations.