A principal component analysis of the TCGA data for 15 cancer localizations unveils the following qualitative facts about tumors: 1) The state of a tissue in gene expression space may be described by a few variables. In particular, there is a single variable describing the progression from a normal tissue to a tumor. 2) Each cancer localization is characterized by a gene expression profile, in which genes have specific weights in the definition of the cancer state. There are no fewer than 2500 differentially expressed genes, which lead to power-like tails in the expression distribution functions. 3) Tumors in different localizations share hundreds or even thousands of differentially expressed genes. There are 6 genes common to the 15 studied tumor localizations. 4) The tumor region is a kind of attractor. Tumors in advanced stages converge to this region independently of patient age or genetic variability. 5) There is a landscape of cancer in gene expression space with an approximate border separating normal tissues from tumors.
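As an illustration of the kind of analysis described above, the following is a minimal sketch of a principal component analysis on a synthetic expression matrix. The data, the sample and gene counts, the size of the shift, and the helper name `pca` are all assumptions made for this example; it is not the preprocessing or pipeline used on the TCGA data.

```python
import numpy as np

def pca(X, n_components=2):
    """PCA via SVD. Rows of X are samples, columns are genes.

    Returns (scores, components, explained_variance_ratio).
    """
    Xc = X - X.mean(axis=0)                      # center each gene
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]
    var = S ** 2
    return scores, Vt[:n_components], var[:n_components] / var.sum()

# Synthetic stand-in for expression data (hypothetical values, not TCGA).
rng = np.random.default_rng(0)
n_normal, n_tumor, n_genes = 40, 60, 500
normal = rng.normal(0.0, 1.0, size=(n_normal, n_genes))
tumor = rng.normal(0.0, 1.0, size=(n_tumor, n_genes))
tumor[:, :50] += 3.0                             # 50 "differentially expressed" genes

X = np.vstack([normal, tumor])
scores, components, ratio = pca(X)

# Coordinates of each group along the first principal component.
pc1_normal = scores[:n_normal, 0]
pc1_tumor = scores[n_normal:, 0]
print("explained variance ratio:", ratio)
print("PC1 mean (normal, tumor):", pc1_normal.mean(), pc1_tumor.mean())
```

In this toy setting the first component alone separates the two groups, which mirrors the statement that a single variable can describe the progression from normal tissue to tumor. The sign of a principal component is arbitrary, so only the separation between the groups is meaningful.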
Gene expression data for a set of 12 localizations from The Cancer Genome Atlas are processed in order to evaluate an entropy-like magnitude allowing the characterization of tumors and comparison with the corresponding normal tissues. The comparison indicates that the number of available states in gene expression space is much greater for tumors than for normal tissues and points to a scaling relation between the fraction of available states and the overlap between the tumor and normal sample clouds.
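The abstract above does not specify how the entropy-like magnitude is defined. Purely as an illustration, one possible proxy is the differential entropy of a Gaussian fitted to each sample cloud, whose exponential scales with the volume the cloud occupies. The function name `gaussian_entropy`, the three-dimensional clouds, and their spreads are assumptions for this sketch, not the authors' estimator.

```python
import numpy as np

def gaussian_entropy(points):
    """Differential entropy (in nats) of a Gaussian fitted to a point cloud.

    Assumed proxy: S = 0.5 * (d * log(2*pi*e) + log det(cov)).
    """
    points = np.asarray(points, dtype=float)
    d = points.shape[1]
    cov = np.atleast_2d(np.cov(points, rowvar=False))
    sign, logdet = np.linalg.slogdet(cov)
    if sign <= 0:
        raise ValueError("covariance matrix is not positive definite")
    return 0.5 * (d * np.log(2.0 * np.pi * np.e) + logdet)

# Hypothetical clouds in a 3-dimensional reduced space:
# the "tumor" cloud is twice as wide in every direction.
rng = np.random.default_rng(1)
normal_cloud = rng.normal(0.0, 1.0, size=(2000, 3))
tumor_cloud = rng.normal(0.0, 2.0, size=(2000, 3))

delta_s = gaussian_entropy(tumor_cloud) - gaussian_entropy(normal_cloud)
print("entropy difference (nats):", delta_s)
print("implied volume ratio:", np.exp(delta_s))
```

For these synthetic clouds the entropy difference should be close to 3 ln 2, that is, a volume ratio of about 8, since the wider cloud is twice as broad along each of the three axes.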
Complex biological functions are carried out by the interaction of genes and proteins. Uncovering the gene regulation network behind a function is one of the central themes in biology. Typically, it involves extensive experiments in genetics, biochemistry and molecular biology. In this paper, we show that much of the inference task can be accomplished by a deep neural network (DNN), a form of machine learning or artificial intelligence. Specifically, the DNN learns from the dynamics of gene expression. The trained DNN behaves like an accurate simulator of the system, on which one can perform in-silico experiments to reveal the underlying gene network. We demonstrate the method with two examples: biochemical adaptation and gap-gene patterning in fruit fly embryogenesis. In the first example, the DNN successfully finds the two basic network motifs for adaptation: negative feedback and incoherent feed-forward. In the second and much more complex example, the DNN accurately predicts the behaviors of essentially all the mutants. Furthermore, the regulation network it uncovers is strikingly similar to the one inferred from experiments. In doing so, we develop methods for deciphering the gene regulation network hidden in the DNN black box. Our interpretable DNN approach should have broad applications in genotype-phenotype mapping.
In unicellular organisms such as bacteria, the same acquired mutations that are beneficial in one environment can be restrictive in another. However, evolving Escherichia coli populations demonstrate remarkable flexibility in adaptation. The mechanisms sustaining genetic flexibility remain unclear. In E. coli the transcriptional regulation of gene expression involves both dedicated regulators binding specific DNA sites with high affinity and global regulators, abundant DNA architectural proteins of the bacterial nucleoid that bind multiple low-affinity sites and thus modulate the superhelical density of DNA. The first form of transcriptional regulation is dominantly pairwise and specific, representing digital control, while the second form is (in strength and distribution) continuous, representing analog control. Here we look at the properties of effective networks derived from significant gene expression changes under variation of the two forms of control and find that upon limitation of one type of control (caused e.g. by mutation of a global DNA architectural factor) the other type can compensate for compromised regulation. Mutations of global regulators significantly enhance the digital control; in the presence of global DNA architectural proteins regulation is mostly of the analog type, coupling spatially neighboring genomic loci; together our data suggest that two logically distinct types of control balance each other. By revealing two distinct logical types of control, our approach provides basic insights into both the organizational principles of transcriptional regulation and the mechanisms buffering genetic flexibility. We anticipate that the general concept of distinguishing logical types of control will apply to many complex biological networks.
In this review we summarize our recent efforts to understand the role of heterogeneity in cancer progression by using neural networks to characterise different aspects of the mapping from a cancer cell's genotype and environment to its phenotype. Our central premise is that cancer is an evolving system subject to mutation and selection, and the primary conduit for these processes is the cancer cell, whose behaviour is regulated on multiple biological scales. The selection pressure is mainly driven by the microenvironment that the tumour is growing in, and this acts directly upon the cell phenotype. In turn, the phenotype is driven by the intracellular pathways that are regulated by the genotype. Integrating all of these processes is a massive undertaking and requires bridging many biological scales (i.e. genotype, pathway, phenotype and environment), of which we will only scratch the surface in this review. We will focus on models that use neural networks as a means of connecting these different biological scales, since they allow us to easily create heterogeneity for selection to act upon and, importantly, this heterogeneity can be implemented at different biological scales. More specifically, we consider three different neural networks that bridge different aspects of these scales and the dialogue with the microenvironment: (i) the impact of the microenvironment on evolutionary dynamics, (ii) the mapping from genotype to phenotype under drug-induced perturbations and (iii) pathway activity in both normal and cancer cells under different microenvironmental conditions.
Environmental and genetic mutations can transform the cells in a co-operating healthy tissue into an ecosystem of individualistic tumour cells that compete for space and resources. Various selection forces are responsible for driving the evolution of cells in a tumour towards more malignant and aggressive phenotypes that tend to have a fitness advantage over the older populations. Although the evolutionary nature of cancer has been recognised for more than three decades (ever since the seminal work of Nowell), only recently have tools traditionally used by ecological and evolutionary researchers been adopted to study the evolution of cancer phenotypes in populations of individuals capable of co-operation and competition. In this chapter we describe game theory as an important tool to study the emergence of cell phenotypes in a tumour and critically review some of its applications in cancer research. These applications demonstrate that game theory can be used to understand the dynamics of somatic cancer evolution and suggest new therapies in which this knowledge could be applied to gain some control over the evolution of the tumour.
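To make the game-theoretic framing above concrete, here is a minimal sketch of two-phenotype replicator dynamics, a standard model in evolutionary game theory. The payoff values, the phenotype labels A and B, and the function name `replicator_trajectory` are hypothetical choices for illustration and do not come from any specific model reviewed in the chapter.

```python
import numpy as np

def replicator_trajectory(payoff, x0, dt=0.01, steps=5000):
    """Euler integration of the two-strategy replicator equation.

    x is the fraction of phenotype A; dx/dt = x * (1 - x) * (f_A - f_B),
    where f_A and f_B are the average payoffs of A and B against the
    current population mix.
    """
    payoff = np.asarray(payoff, dtype=float)
    x = float(x0)
    traj = np.empty(steps + 1)
    traj[0] = x
    for k in range(steps):
        population = np.array([x, 1.0 - x])
        fitness = payoff @ population            # [f_A, f_B]
        x += dt * x * (1.0 - x) * (fitness[0] - fitness[1])
        traj[k + 1] = x
    return traj

# Hypothetical payoff matrix: rows are the focal phenotype (A, B),
# columns are the opponent phenotype (A, B).
payoff = [[0.2, 1.0],
          [0.6, 0.5]]

traj = replicator_trajectory(payoff, x0=0.1)
x_final = traj[-1]

# Interior fixed point where f_A == f_B: x* = (b - d) / (b - d + c - a).
a, b = payoff[0]
c, d = payoff[1]
x_star = (b - d) / (b - d + c - a)
print("final fraction of A:", x_final, "predicted coexistence point:", x_star)
```

With these payoffs each phenotype does better when rare, so the population settles at a stable mixture rather than being taken over by one phenotype. Changing the payoff entries, for example to mimic a therapy that penalises one phenotype, shifts or removes this coexistence point.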