Disaster recovery is widely regarded as the least understood phase of the disaster cycle. In particular, the literature on lifeline infrastructure restoration modeling frequently mentions the lack of available empirical quantitative data. Despite this limitation, there is a growing body of research on modeling lifeline infrastructure restoration, often developed using empirical quantitative data. This study reviews this body of literature and identifies the data collection and usage patterns present across modeling approaches to inform future efforts using empirical quantitative data. We classify the modeling approaches into simulation, optimization, and statistical modeling. The number of publications in this domain has increased over time, with statistical modeling growing most rapidly. Electricity infrastructure restoration is most frequently modeled, followed by the restoration of multiple infrastructures, water infrastructure, and transportation infrastructure. Interdependency between multiple infrastructures is increasingly considered in recent literature. Researchers gather data from a variety of sources, including collaborations with utility companies, national databases, and post-event damage and restoration reports. This study provides discussion and recommendations on data usage practices within the lifeline restoration modeling field. Following the recommendations would facilitate the development of a community of practice around restoration modeling and provide greater opportunities for future data sharing.
Atmospheric modeling has recently experienced a surge with the advent of deep learning. Most of these models, however, predict concentrations of pollutants following a data-driven approach in which the physical laws that govern their behaviors and relationships remain hidden. With the aid of real-world air quality data collected hourly at different stations throughout Madrid, we present an empirical approach using data-driven techniques with the following goals: (1) find parsimonious systems of ordinary differential equations via sparse identification of nonlinear dynamics (SINDy) that model the concentration of pollutants and their changes over time; (2) assess the performance and limitations of our models using stability analysis; (3) reconstruct the time series of chemical pollutants not measured in certain stations using delay coordinate embedding results. Our results show that Akaike's Information Criterion can work well in conjunction with best subset regression to find an equilibrium between sparsity and goodness of fit. We also find that, due to the complexity of the chemical system under study, identifying the dynamics of this system over longer periods of time requires higher levels of data filtering and smoothing. Stability analysis for the reconstructed ordinary differential equations (ODEs) reveals that more than half of the physically relevant critical points are saddle points, suggesting that the system is unstable even under the idealized assumption that all environmental conditions are constant over time.
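To make the pairing of best subset regression with Akaike's Information Criterion concrete, the following is a minimal sketch, not the authors' implementation. It assumes a Gaussian-error AIC (up to an additive constant), a small hypothetical polynomial candidate library, and synthetic stand-in data rather than the Madrid measurements; the names `best_subset_aic`, `library`, and `dxdt` are illustrative only.

```python
from itertools import combinations

import numpy as np


def best_subset_aic(library, target, max_terms=3):
    """Return (aic, support, coefs) for the AIC-minimizing subset of library columns."""
    n_samples, n_candidates = library.shape
    best = None
    for k in range(1, max_terms + 1):
        for support in combinations(range(n_candidates), k):
            cols = library[:, list(support)]
            # Ordinary least squares restricted to the chosen candidate terms.
            coefs, *_ = np.linalg.lstsq(cols, target, rcond=None)
            rss = float(np.sum((target - cols @ coefs) ** 2))
            # Gaussian-error AIC up to an additive constant; guard against log(0).
            aic = n_samples * np.log(max(rss, 1e-300) / n_samples) + 2 * k
            if best is None or aic < best[0]:
                best = (aic, support, coefs)
    return best


# Synthetic stand-in data (assumption): two state variables and a noisy derivative.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, 200)
y = rng.uniform(-1.0, 1.0, 200)
names = ["1", "x", "y", "x*y", "x^2"]
library = np.column_stack([np.ones_like(x), x, y, x * y, x**2])
dxdt = -0.5 * x + 2.0 * x * y + 0.01 * rng.standard_normal(200)

aic, support, coefs = best_subset_aic(library, dxdt)
selected = {names[i]: float(c) for i, c in zip(support, coefs)}
print(selected)
```

The penalty term `2 * k` is what trades goodness of fit against sparsity: a larger subset is kept only if it lowers the residual sum of squares enough to pay for the extra coefficient. Exhaustive enumeration is feasible here only because the candidate library is tiny.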
Quantitatively predicting phenotype variables from the expression changes in a set of candidate genes is of great interest in molecular biology, but it is also a challenging task for several reasons. First, the collected biological observations might be heterogeneous and correspond to different biological mechanisms. Second, the gene expression variables used to predict the phenotype are potentially highly correlated since genes interact through unknown regulatory networks. In this paper, we present a novel approach designed to predict a quantitative trait from transcriptomic data, taking into account the heterogeneity in biological samples and the hidden gene regulatory networks underlying different biological mechanisms. The proposed model performs well on prediction, and it is also fully parametric, which facilitates the downstream biological interpretation. The model provides clusters of individuals based on the relation between gene expression data and the phenotype, and also allows inference of a gene regulatory network specific to each cluster of individuals. We perform numerical simulations to demonstrate that our model is competitive with other prediction models, and we demonstrate the predictive performance and the interpretability of our model by predicting alcohol sensitivity from real transcriptomic data from the Drosophila melanogaster Genetic Reference Panel (DGRP).
Active matter comprises individual units that convert energy into mechanical motion. In many examples, such as bacterial systems and biofilament assays, constituent units are elongated and can give rise to local nematic orientational order. Such 'active nematics' systems have attracted much attention from both theorists and experimentalists. However, despite intense research efforts, data-driven quantitative modeling has not been achieved, a situation mainly due to the lack of systematic experimental data and to the large number of parameters of current models. Here we introduce a new active nematics system made of swarming filamentous bacteria. We simultaneously measure orientation and velocity fields and show that the complex spatiotemporal dynamics of our system can be quantitatively reproduced by a new type of microscopic model for active suspensions whose important parameters are all estimated from comprehensive experimental data. This provides unprecedented access to key effective parameters and mechanisms governing active nematics. Our approach is applicable to different types of dense suspensions and shows a path towards more quantitative active matter research.
Recent technological breakthroughs in spatial molecular profiling have enabled the comprehensive molecular characterization of single cells while preserving spatial information. This provides new opportunities to delineate how cells from different origins form tissues with distinctive structures and functions. One immediate question in the analysis of spatial molecular profiling data is how to identify spatially variable genes. Most of the current methods build upon a geostatistical model with a Gaussian process that relies on selecting ad hoc kernels to account for spatial expression patterns. To overcome this potential challenge and capture more types of spatial patterns, we introduce a Bayesian approach to identify spatially variable genes via an Ising model. The key idea is to use the energy interaction parameter of the Ising model to characterize spatial expression patterns. We use auxiliary variable Markov chain Monte Carlo algorithms to sample from the posterior distribution with an intractable normalizing constant in the Ising model. Simulation results show that our energy-based modeling approach achieves higher accuracy in detecting spatially variable genes than kernel-based methods. Applying our method to two real spatial transcriptomics datasets, we discovered novel spatial patterns that shed light on the biological mechanisms. The proposed method presents a new perspective for analyzing spatial transcriptomics data.
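The role of the interaction parameter can be illustrated with a small sketch. It assumes a textbook two-state Ising model on a square lattice with nearest-neighbor coupling, which may differ from the exact parametrization used in the paper; the function names and the toy 10 x 10 patterns are hypothetical. Expression is taken to be already dichotomized into +1 (high) and -1 (low).

```python
import numpy as np


def neighbor_agreement(spins):
    """Sum of s_i * s_j over horizontally and vertically adjacent lattice sites."""
    horizontal = np.sum(spins[:, :-1] * spins[:, 1:])
    vertical = np.sum(spins[:-1, :] * spins[1:, :])
    return int(horizontal + vertical)


def unnormalized_log_prob(spins, theta):
    """Log-probability up to the normalizing constant, which is intractable in general."""
    return theta * neighbor_agreement(spins)


# Toy spatially variable pattern (assumption): high on the left half, low on the right.
clustered = np.ones((10, 10), dtype=int)
clustered[:, 5:] = -1

# Same number of high and low spots, but with locations shuffled at random.
rng = np.random.default_rng(0)
shuffled = rng.permutation(clustered.ravel()).reshape(10, 10)

stat_clustered = neighbor_agreement(clustered)
stat_shuffled = neighbor_agreement(shuffled)
print(stat_clustered, stat_shuffled)
```

A spatially structured pattern yields a large agreement statistic and is therefore favored by a positive interaction parameter, whereas a shuffled pattern yields a statistic near zero. Evaluating the full likelihood would require summing over all spin configurations, which is why sampling schemes that avoid the normalizing constant are needed.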
While it is well known that high levels of prenatal alcohol exposure (PAE) result in significant cognitive deficits in children, the exact nature of the dose response is less well understood. In particular, there is a pressing need to identify the levels of PAE associated with an increased risk of clinically significant adverse effects. To address this issue, data have been combined from six longitudinal birth cohort studies in the United States that assessed the effects of PAE on cognitive outcomes measured from early school age through adolescence. Structural equation models (SEMs) are commonly used to capture the association among multiple observed outcomes in order to characterise the underlying variable of interest (in this case, cognition) and then relate it to PAE. However, it was not possible to apply classic SEM software in our context because different outcomes were measured in the six studies. In this paper we show how a Bayesian approach can be used to fit a multi-group multi-level structural model that maps cognition to a broad range of observed variables measured at multiple ages. These variables map to several different cognitive subdomains and are examined in relation to PAE after adjusting for confounding using propensity scores. The model also tests the possibility of a change point in the dose-response function.
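As a deliberately simplified, non-Bayesian illustration of what a change point in a dose-response function means, the sketch below fits a piecewise-linear (hinge) curve by least squares and selects the change point by grid search. This is not the structural model described above: the data are synthetic, and `fit_hinge`, `dose`, `outcome`, and the candidate grid are hypothetical names and choices.

```python
import numpy as np


def fit_hinge(dose, outcome, candidate_taus):
    """Grid-search the change point tau of outcome = b0 + b1*dose + b2*max(dose - tau, 0)."""
    best = None
    for tau in candidate_taus:
        design = np.column_stack(
            [np.ones_like(dose), dose, np.maximum(dose - tau, 0.0)]
        )
        coefs, *_ = np.linalg.lstsq(design, outcome, rcond=None)
        rss = float(np.sum((outcome - design @ coefs) ** 2))
        if best is None or rss < best[0]:
            best = (rss, float(tau), coefs)
    return best


# Synthetic data (assumption): no effect below a dose of 1.0, linear decline above it.
rng = np.random.default_rng(1)
dose = rng.uniform(0.0, 3.0, 400)
outcome = -2.0 * np.maximum(dose - 1.0, 0.0) + 0.05 * rng.standard_normal(400)

rss, tau_hat, coefs = fit_hinge(dose, outcome, np.arange(0.2, 2.81, 0.1))
print(tau_hat, coefs)
```

In a Bayesian treatment the change point would instead receive a prior and be estimated jointly with the other parameters, yielding a posterior distribution for its location rather than a single grid value.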