Identifying directed interactions between species from time series of their population densities has many uses in ecology. This key statistical task is equivalent to causal time series inference, which connects to the Granger causality (GC) concept: $x$ causes $y$ if $x$ improves the prediction of $y$ in a dynamic model. However, the entangled nature of nonlinear ecological systems has led researchers to question the appropriateness of Granger causality, especially in its classical linear Multivariate AutoRegressive (MAR) model form. Convergent cross mapping (CCM), a nonparametric method developed for deterministic dynamical systems, has been suggested as an alternative. Here, we show that linear GC and CCM are able to uncover interactions with surprisingly similar performance, for predator-prey cycles, 2-species deterministic (chaotic) or stochastic competition, as well as 10- and 20-species interaction networks. There is no correspondence between the degree of nonlinearity of the dynamics and which method performs best. Our results therefore imply that Granger causality, even in its linear MAR($p$) formulation, is a valid method for inferring interactions in nonlinear ecological networks; using GC or CCM (or both) can instead be decided based on the aims and specifics of the analysis.
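The prediction-improvement idea behind linear GC can be illustrated with a minimal sketch. It is not the authors' implementation: the simulated two-variable system, the lag order `p=2`, and the helper names `lagged_design`, `residual_variance`, and `granger_index` are all assumptions made for illustration. The index is the log ratio of residual variances between an autoregression of $y$ on its own past and one that also includes the past of $x$.

```python
import numpy as np

def lagged_design(series_list, p):
    """Design matrix with an intercept and lags 1..p of every series."""
    n = len(series_list[0])
    cols = [np.ones(n - p)]
    for s in series_list:
        for lag in range(1, p + 1):
            cols.append(s[p - lag:n - lag])
    return np.column_stack(cols)

def residual_variance(target, predictors, p):
    """Least-squares fit of target[t] on lagged predictors; returns residual variance."""
    X = lagged_design(predictors, p)
    y = target[p:]
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.var(y - X @ beta)

def granger_index(source, target, p=2):
    """ln(restricted variance / full variance); values near 0 mean no improvement."""
    restricted = residual_variance(target, [target], p)
    full = residual_variance(target, [target, source], p)
    return np.log(restricted / full)

def simulate(n=2000, seed=0):
    """Hypothetical toy system (an assumption): x drives y, y does not drive x."""
    rng = np.random.default_rng(seed)
    x = np.zeros(n)
    y = np.zeros(n)
    for t in range(1, n):
        x[t] = 0.5 * x[t - 1] + rng.normal()
        y[t] = 0.5 * y[t - 1] + 0.4 * x[t - 1] + rng.normal()
    return x, y

x, y = simulate()
gc_xy = granger_index(x, y)  # expected to be clearly positive
gc_yx = granger_index(y, x)  # expected to be close to zero
print(f"GC x->y: {gc_xy:.3f}, GC y->x: {gc_yx:.3f}")
```

In practice the index would be paired with a significance test (for example an F-test on the nested models) and a data-driven choice of $p$; both are omitted here to keep the sketch short.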
Continuous, automated surveillance systems that incorporate machine learning models are becoming increasingly common in healthcare environments. These models can capture temporally dependent changes across multiple patient variables and can enhance a clinician's situational awareness by providing an early warning alarm of an impending adverse event such as sepsis. However, most commonly used methods, e.g., XGBoost, fail to provide an interpretable mechanism for understanding why a model produced a sepsis alarm at a given time. The black-box nature of many models is a severe limitation as it prevents clinicians from independently corroborating those physiologic features that have contributed to the sepsis alarm. To overcome this limitation, we propose a generalized linear model (GLM) approach to fit a Granger causal graph based on the physiology of several major sepsis-associated derangements (SADs). We adopt a recently developed stochastic monotone variational inequality-based estimator coupled with forwarding feature selection to learn the graph structure from both continuous and discrete-valued as well as regularly and irregularly sampled time series. Most importantly, we develop a non-asymptotic upper bound on the estimation error for any monotone link function in the GLM. We conduct real-data experiments and demonstrate that our proposed method can achieve comparable performance to popular and powerful prediction methods such as XGBoost while simultaneously maintaining a high level of interpretability.
In the study of complex physical and biological systems represented by multivariate stochastic processes, an issue of great relevance is the description of the system dynamics spanning multiple temporal scales. While methods to assess the dynamic complexity of individual processes at different time scales are well-established, multiscale analysis of directed interactions has never been formalized theoretically, and empirical evaluations are complicated by practical issues such as filtering and downsampling. Here we extend the very popular measure of Granger causality (GC), a prominent tool for assessing directed lagged interactions between joint processes, to quantify information transfer across multiple time scales. We show that the multiscale processing of a vector autoregressive (AR) process introduces a moving average (MA) component, and describe how to represent the resulting ARMA process using state space (SS) models and to combine the SS model parameters for computing exact GC values at arbitrarily large time scales. We exploit the theoretical formulation to identify peculiar features of multiscale GC in basic AR processes, and demonstrate with numerical simulations the much higher estimation accuracy of the SS approach compared with pure AR modeling of filtered and downsampled data. The improved computational reliability is exploited to disclose meaningful multiscale patterns of information transfer between global temperature and carbon dioxide concentration time series, both in paleoclimate and in recent years.
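The claim that filtering an AR process introduces an MA component can be checked numerically in the simplest scalar case. The sketch below is an illustration under stated assumptions, not the paper's SS method: it uses a scalar AR(1) process with coefficient `a = 0.5` and a two-point averaging filter, both chosen here for convenience, and it covers only the filtering step, not downsampling. A pure AR(1) process has lag-1 autocorrelation equal to its geometric decay rate; after averaging, the decay rate stays at $a$ while the lag-1 autocorrelation becomes $(1+a)/2$, which is the signature of an ARMA(1,1) process.

```python
import numpy as np

def simulate_ar1(a, n, seed=0):
    """Scalar AR(1) process x[t] = a * x[t-1] + noise[t] with unit-variance Gaussian noise."""
    rng = np.random.default_rng(seed)
    noise = rng.normal(size=n)
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = a * x[t - 1] + noise[t]
    return x

def autocorr(z, lag):
    """Sample autocorrelation of z at a positive lag."""
    z = z - z.mean()
    return np.dot(z[:-lag], z[lag:]) / np.dot(z, z)

a = 0.5  # assumed AR coefficient, chosen for illustration
x = simulate_ar1(a, n=100_000)

# Two-point averaging filter, a stand-in for the rescaling filter (an assumption).
z = 0.5 * (x[1:] + x[:-1])

rho1_raw = autocorr(x, 1)           # pure AR(1): close to a
rho1 = autocorr(z, 1)               # filtered: close to (1 + a) / 2
decay = autocorr(z, 2) / rho1       # filtered: decay rate still close to a

print(f"raw rho(1) = {rho1_raw:.3f}")
print(f"filtered rho(1) = {rho1:.3f}, rho(2)/rho(1) = {decay:.3f}")
```

Because the filtered series no longer satisfies rho(1) = rho(2)/rho(1), fitting a pure AR model to it is misspecified, which is consistent with the motivation given above for an ARMA or state space representation.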
Granger causality is a statistical notion of causal influence based on prediction via vector autoregression. For Gaussian variables it is equivalent to transfer entropy, an information-theoretic measure of time-directed information transfer between jointly dependent processes. We exploit this equivalence and calculate exactly the local Granger causality, i.e., the profile of the information transfer at each discrete time point in Gaussian processes; in this framework Granger causality is the average of its local version. Our approach offers a robust and computationally fast method to follow the information transfer along the time history of linear stochastic processes, as well as of nonlinear complex systems studied in the Gaussian approximation.
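One way to see how a local profile can average to the usual GC value is the following sketch. It rests on an assumed convention rather than on the paper's own definition: the local quantity is taken as twice the log ratio of the Gaussian predictive densities of the full and restricted regressions, so that its mean equals $\ln(\sigma_r^2/\sigma_f^2)$. The toy system, the single lag, and the function names are hypothetical.

```python
import numpy as np

def ols_residuals(y, X):
    """Residuals of a least-squares fit of y on the columns of X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta

def local_granger(source, target):
    """Local GC profile (assumed convention: twice the local transfer entropy
    under Gaussian residuals, one lag) and the usual GC value."""
    y = target[1:]
    ones = np.ones(len(y))
    X_restricted = np.column_stack([ones, target[:-1]])
    X_full = np.column_stack([ones, target[:-1], source[:-1]])
    e_r = ols_residuals(y, X_restricted)
    e_f = ols_residuals(y, X_full)
    var_r = np.mean(e_r ** 2)
    var_f = np.mean(e_f ** 2)
    gc = np.log(var_r / var_f)
    # 2 * [ln N(e_f; 0, var_f) - ln N(e_r; 0, var_r)] at each time point
    local = gc + e_r ** 2 / var_r - e_f ** 2 / var_f
    return local, gc

def simulate(n=2000, seed=1):
    """Hypothetical toy system (an assumption): x drives y with one lag."""
    rng = np.random.default_rng(seed)
    x = np.zeros(n)
    y = np.zeros(n)
    for t in range(1, n):
        x[t] = 0.5 * x[t - 1] + rng.normal()
        y[t] = 0.5 * y[t - 1] + 0.4 * x[t - 1] + rng.normal()
    return x, y

x, y = simulate()
local, gc = local_granger(x, y)
print(f"GC = {gc:.4f}, mean of local profile = {local.mean():.4f}")
```

With variances estimated as mean squared residuals, the two normalized squared-residual terms each average to one, so the sample mean of the local profile matches the GC value up to floating-point error. Individual local values can be negative even though their average is not.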
Population dynamics of a competitive two-species system under the influence of random events are analyzed and expressions for the steady-state population mean, fluctuations, and cross-correlation of the two species are presented. It is shown that random events cause the population mean of each species to make a smooth transition from far above to far below its growth rate threshold. At the same time, the population mean of the weaker species never reaches the extinction point. It is also shown that, as a result of competition, the relative population fluctuations do not die out as the growth rates of both species are raised far above their respective thresholds. This behavior is most remarkable at the maximum competition point, where the weaker species' population statistics become completely chaotic regardless of how far its growth rate is raised.
Eight molecular datasets involving human and teleost examples, along with morphological samples from several groups of Neotropical electric fish (Order: Gymnotiformes), were analyzed in this thesis to test the dynamics of both intraspecific variation and interspecific diversity. To investigate molecular interspecific diversity among humans, two experimental exercises were performed. A cladistic exchange experiment tested for the extent of discontinuity and interbreeding between H. sapiens and neanderthal populations. As part of the same question, another experimental exercise tested the amount of molecular variance resulting from simulations which treated neanderthals as being either a local population of modern humans or as a distinct subspecies. Finally, comparisons of hominid populations over time with fish species helped to define what constitutes taxonomically relevant differences between morphological populations as expressed among both trait size ranges and through growth patterns that begin during ontogeny. Compared to the subdivision found within selected teleost species, H. sapiens molecular data exhibited little variation and discontinuity between geographical regions. The two experimental exercises indicated that neanderthals exhibit taxonomic distance from modern H. sapiens. However, this distance was not so great as to exclude the possibility of interbreeding between the two subspecific groups. Finally, a series of characters was analyzed among species of Neotropical electric fish. These analyses were compared with hominid examples to determine what constituted taxonomically relevant differences between populations as expressed among specific morphometric traits that develop during the juvenile phase.