Angiogenesis is the process by which new blood vessels form from pre-existing vessels. It plays a key role in many biological processes, including embryonic development and wound healing, and contributes to many diseases, including cancer and rheumatoid arthritis. The structure of the resulting vessel networks determines their ability to deliver nutrients to, and remove waste products from, biological tissues. Here we simulate the Anderson-Chaplain model of angiogenesis at different parameter values and quantify the vessel architectures of the resulting synthetic data. Specifically, we propose a topological data analysis (TDA) pipeline for systematic analysis of the model. TDA is a vibrant and relatively new field of computational mathematics for studying the shape of data. We compute topological and standard descriptors of model simulations generated by different parameter values, and show that TDA of the simulation data stratifies parameter space into regions with similar vessel morphology. The methodologies proposed here are widely applicable to other synthetic and experimental data, including those arising in wound healing, development, and plant biology.
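The specific TDA pipeline is not detailed in this abstract, but a core ingredient of such pipelines is tracking topological invariants of point data (e.g. vessel tip or branch locations) across spatial scales. As a minimal, purely illustrative sketch (all names and data here are hypothetical, and a real pipeline would use persistent homology across all scales rather than a single one), the zeroth Betti number of a Vietoris-Rips complex at a fixed scale can be computed with a union-find structure:

```python
import math

def betti0(points, scale):
    """Number of connected components (Betti-0) of the Vietoris-Rips
    complex on `points`: connect two points when their distance <= scale."""
    n = len(points)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(points[i], points[j]) <= scale:
                ri, rj = find(i), find(j)
                if ri != rj:
                    parent[ri] = rj

    return len({find(i) for i in range(n)})

# Two well-separated clusters of hypothetical "vessel tip" positions.
pts = [(0, 0), (0.1, 0.2), (0.2, 0.1), (5, 5), (5.1, 5.2)]
print(betti0(pts, scale=0.5))   # → 2 (clusters are separate at small scale)
print(betti0(pts, scale=10.0))  # → 1 (one component once the scale spans the gap)
```

Recording how such Betti numbers change as the scale varies is what distinguishes persistence-based descriptors from single-scale summaries.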
This paper presents a new stochastic framework for parameter estimation and uncertainty quantification in colon cancer-induced angiogenesis using patient data. The dynamics of colon cancer are described by a stochastic process that captures the inherent randomness in the system. The framework is based on the Fokker-Planck equation, which governs the evolution of the probability density function corresponding to the stochastic process. An optimization problem is formulated that takes individual patient data, with the randomness present in them, as input, and is solved to obtain the unknown parameters corresponding to the individual tumor characteristics. Furthermore, sensitivity analysis of the optimal parameter set is performed to determine which parameters need to be controlled, thereby providing information about the types of drugs that can be used for treatment.
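The abstract does not specify the stochastic process, but frameworks of this kind start from a stochastic differential equation whose density evolves by the Fokker-Planck equation. As an illustration only (the drift, diffusion, and parameter names below are stand-ins, not the paper's model), an Euler-Maruyama simulation of logistic growth with multiplicative noise looks like:

```python
import random

def euler_maruyama(x0, r, K, sigma, dt, steps, seed=0):
    """Simulate dX = r*X*(1 - X/K) dt + sigma*X dW by the
    Euler-Maruyama scheme; returns the sampled trajectory."""
    rng = random.Random(seed)
    x = x0
    path = [x]
    for _ in range(steps):
        dW = rng.gauss(0.0, dt ** 0.5)  # Brownian increment ~ N(0, dt)
        x = x + r * x * (1 - x / K) * dt + sigma * x * dW
        path.append(x)
    return path

path = euler_maruyama(x0=0.1, r=1.0, K=1.0, sigma=0.05, dt=0.01, steps=1000)
print(round(path[-1], 3))  # trajectory fluctuates near the carrying capacity K
```

Parameter estimation in such a framework amounts to tuning quantities like `r`, `K`, and `sigma` so that the density implied by the Fokker-Planck equation matches the patient data.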
The recent explosion of genomic data has underscored the need for interpretable and comprehensive analyses that can capture complex phylogenetic relationships within and across species. Recombination, reassortment, and horizontal gene transfer are examples of pervasive biological phenomena that cannot be captured by tree-like representations. Starting from hundreds of genomes, we are interested in reconstructing potential evolutionary histories that lead to the observed data. Ancestral recombination graphs represent potential histories that explicitly accommodate recombination and mutation events across orthologous genomes, but they are computationally costly to reconstruct, and reconstruction is usually infeasible for more than a few tens of genomes. Recently, Topological Data Analysis (TDA) methods have been proposed as robust and scalable alternatives that can capture the genetic scale and frequency of recombination. We build upon previous TDA developments for detecting and quantifying recombination and present a novel framework that scales to hundreds of genomes and can be interpreted in terms of minimal histories of mutation and recombination events, quantifying the scales and identifying the genomic locations of recombinations. We implement this framework in a software package called TARGet and apply it to several examples, including small-scale migration between populations, human recombination, and horizontal evolution in finches inhabiting the Galapagos Islands.
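TARGet's TDA machinery is beyond a short sketch, but the core phenomenon it detects, namely recombination producing data that no single tree can explain, can be illustrated with the classical four-gamete test. This is not part of the TDA framework described above; it is a standard non-topological heuristic, shown here only to make the tree-incompatibility concrete (all names are illustrative):

```python
from itertools import combinations

def four_gamete_violations(sequences):
    """Return pairs of site indices at which all four gametes
    (00, 01, 10, 11) occur among binary haplotypes. Under the
    infinite-sites assumption, such a pair cannot be explained
    by a single tree, signalling a recombination event."""
    n_sites = len(sequences[0])
    violations = []
    for i, j in combinations(range(n_sites), 2):
        gametes = {(s[i], s[j]) for s in sequences}
        if len(gametes) == 4:
            violations.append((i, j))
    return violations

# Four binary haplotypes; sites 0 and 1 exhibit all four gametes.
haps = ["00", "01", "10", "11"]
print(four_gamete_violations(haps))  # → [(0, 1)]
```

TDA-based approaches generalise this idea: incompatibilities between sites create loops in the space of genomic distances, which persistent homology detects and quantifies at the appropriate genetic scale.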
To support and guide extensive experimental research into the systems biology of signaling pathways, an increasing number of mechanistic models are being developed in the hope of gaining further insight into biological processes. To analyse these models, computational and statistical techniques are needed to estimate the unknown kinetic parameters. This chapter reviews methods from frequentist and Bayesian statistics for parameter estimation and for choosing which model best describes the underlying system. Approximate Bayesian Computation (ABC) techniques are introduced and employed to explore different hypotheses about the JAK-STAT signaling pathway.
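The simplest ABC variant, rejection sampling, underlies the techniques reviewed in such chapters: draw parameters from the prior, simulate data from the model, and keep draws whose simulated summary statistics fall within a tolerance of the observed ones. A minimal sketch on a toy Gaussian problem (the function names, toy model, and tolerance are illustrative, not taken from the chapter):

```python
import random

def abc_rejection(observed, simulate, prior_sample, distance, eps, n_draws, seed=0):
    """Basic ABC rejection: accept theta ~ prior whenever the simulated
    summary statistic is within eps of the observed one."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        theta = prior_sample(rng)
        if distance(simulate(theta, rng), observed) <= eps:
            accepted.append(theta)
    return accepted

# Toy example: infer the mean of a Gaussian with known sd = 1,
# using the sample mean of 50 draws as the summary statistic.
obs_mean = 2.0
post = abc_rejection(
    observed=obs_mean,
    simulate=lambda th, rng: sum(rng.gauss(th, 1.0) for _ in range(50)) / 50,
    prior_sample=lambda rng: rng.uniform(-5, 5),
    distance=lambda a, b: abs(a - b),
    eps=0.2,
    n_draws=5000,
)
print(round(sum(post) / len(post), 2))  # approximate posterior mean, close to 2.0
```

For mechanistic signaling models, `simulate` would integrate the kinetic ODEs (or run a stochastic simulation) instead of drawing Gaussian samples, and more efficient ABC schemes (e.g. sequential Monte Carlo) replace plain rejection when simulations are expensive.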
Bayesian inference methods rely on numerical algorithms for both model selection and parameter inference, and these algorithms generally require substantial computational effort to yield reliable estimates. One of the major challenges in phylogenetics is the estimation of the marginal likelihood. This quantity is commonly used for comparing different evolutionary models, but its calculation, even for simple models, incurs high computational cost. Another challenge relates to the estimation of the posterior distribution: long Markov chains are often required to obtain sufficient samples for parameter inference, especially for tree distributions. These problems are generally addressed separately, using different procedures. Nested sampling (NS) is a Bayesian computation algorithm that estimates marginal likelihoods together with their uncertainties and samples from the posterior distribution at no extra cost. The methods currently used in phylogenetics for marginal likelihood estimation lack practicality, owing to their dependence on many tuning parameters and the inability of most implementations to provide a direct way to calculate the uncertainties associated with the estimates. To address these issues, we introduce NS to phylogenetics, assess its performance under different scenarios, and compare it to established methods. We conclude that NS is a competitive and attractive algorithm for phylogenetic inference. An implementation is available as a package for BEAST 2 under the LGPL licence, accessible at https://github.com/BEAST2-Dev/nested-sampling.
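The nested sampling idea can be sketched independently of phylogenetics: maintain a set of "live" points drawn from the prior, repeatedly discard the one with the lowest likelihood while shrinking the enclosed prior volume by an expected factor exp(-1/N), and accumulate the evidence Z as a sum of likelihood times shed volume. The sketch below uses naive rejection sampling for the constrained prior draw, which is only feasible for toy problems; real implementations (including the BEAST 2 package) use MCMC moves inside the constraint. All names and the toy model are illustrative:

```python
import math
import random

def logaddexp(a, b):
    """log(exp(a) + exp(b)) computed without overflow."""
    if a == -math.inf:
        return b
    if b == -math.inf:
        return a
    m = max(a, b)
    return m + math.log1p(math.exp(min(a, b) - m))

def nested_sampling(loglike, prior_sample, n_live=50, n_iter=400, seed=0):
    """Minimal nested sampling estimate of the log-evidence log Z."""
    rng = random.Random(seed)
    live = [prior_sample(rng) for _ in range(n_live)]
    loglikes = [loglike(x) for x in live]
    log_z, x_prev = -math.inf, 1.0
    for i in range(1, n_iter + 1):
        worst = min(range(n_live), key=lambda k: loglikes[k])
        l_min = loglikes[worst]
        x_i = math.exp(-i / n_live)          # expected prior-volume shrinkage
        log_z = logaddexp(log_z, l_min + math.log(x_prev - x_i))
        x_prev = x_i
        while True:                          # draw from the constrained prior
            cand = prior_sample(rng)         # by rejection (toy problems only)
            if loglike(cand) > l_min:
                break
        live[worst], loglikes[worst] = cand, loglike(cand)
    for ll in loglikes:                      # remaining live-point contribution
        log_z = logaddexp(log_z, ll + math.log(x_prev / n_live))
    return log_z

# Toy: uniform prior on [0, 1], normalised Gaussian likelihood at 0.5,
# so the analytic evidence is ~1.
sigma = 0.05
log_z = nested_sampling(
    loglike=lambda t: -0.5 * ((t - 0.5) / sigma) ** 2
    - math.log(sigma * math.sqrt(2 * math.pi)),
    prior_sample=lambda rng: rng.uniform(0.0, 1.0),
)
print(round(math.exp(log_z), 2))  # estimate of Z; analytic value is ~1
```

The spread of log Z across repeated runs (or the standard Skilling estimate sqrt(H/N)) gives the uncertainty that the abstract highlights as a built-in benefit of NS.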
The technology for generating Spatially Resolved Transcriptomics (SRT) data is rapidly improving and is being applied to investigate a variety of biological tissues. The ability to interrogate how spatially localised gene expression lends new insight into tissue development is critical, but the appropriate tools to analyse these data are still emerging. This chapter reviews available packages and pipelines for the analysis of different SRT datasets, with a focus on identifying spatially variable genes (SVGs) among other aims, and discusses the importance of, and challenges in, establishing a standardised ground truth in the biological data for benchmarking.
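Many SVG-detection pipelines rank genes by spatial autocorrelation: a gene whose expression at nearby spots is more similar than chance predicts is a candidate SVG. A minimal stdlib sketch of one such statistic, Moran's I with binary distance-threshold weights, is shown below; this is an illustration of the general idea, not the method of any particular package, and the coordinates, expression values, and radius are invented:

```python
import math

def morans_i(coords, values, radius):
    """Moran's I spatial autocorrelation of `values` observed at `coords`,
    using binary weights w_ij = 1 when two spots lie within `radius`.
    Positive values indicate spatially clustered expression."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    num = 0.0   # sum of w_ij * dev_i * dev_j over ordered pairs
    s0 = 0.0    # total weight
    for i in range(n):
        for j in range(n):
            if i != j and math.dist(coords[i], coords[j]) <= radius:
                num += dev[i] * dev[j]
                s0 += 1.0
    denom = sum(d * d for d in dev)
    return (n / s0) * (num / denom)

# A gene expressed only on the left half of a 1D strip of ten spots:
# strong positive spatial autocorrelation.
coords = [(float(i), 0.0) for i in range(10)]
expr = [5, 5, 5, 5, 5, 0, 0, 0, 0, 0]
print(round(morans_i(coords, expr, radius=1.5), 2))  # → 0.78
```

In a benchmarking setting, a permutation test on spot labels turns this statistic into a p-value per gene, which is one place where the lack of a standardised ground truth complicates comparisons between pipelines.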