No Arabic abstract
We propose the use of finite mixtures of continuous distributions in modelling the process by which new individuals, that arrive in groups, become part of a wildlife population. We demonstrate this approach using a data set of migrating semipalmated sandpipers (Calidris pussila) for which we extend existing stopover models to allow for individuals to have different behaviour in terms of their stopover duration at the site. We demonstrate the use of reversible jump MCMC methods to derive posterior distributions for the model parameters and the models, simultaneously. The algorithm moves between models with different numbers of arrival groups as well as between models with different numbers of behavioural groups. The approach is shown to provide new ecological insights about the stopover behaviour of semipalmated sandpipers but is generally applicable to any population in which animals arrive in groups and potentially exhibit heterogeneity in terms of one or more other processes.
With the wide adoption of functional magnetic resonance imaging (fMRI) by cognitive neuroscience researchers, large volumes of brain imaging data have been accumulated in recent years. Aggregating these data to derive scientific insights often faces the challenge that fMRI data are high-dimensional, heterogeneous across people, and noisy. These challenges demand the development of computational tools that are tailored both for the neuroscience questions and for the properties of the data. We review a few recently developed algorithms in various domains of fMRI research: fMRI in naturalistic tasks, analyzing full-brain functional connectivity, pattern classification, inferring representational similarity and modeling structured residuals. These algorithms all tackle the challenges in fMRI similarly: they start by making clear statements of assumptions about neural data and existing domain knowledge, incorporating those assumptions and domain knowledge into probabilistic graphical models, and using those models to estimate properties of interest or latent structures in the data. Such approaches can avoid erroneous findings, reduce the impact of noise, better utilize known properties of the data, and better aggregate data across groups of subjects. With these successful cases, we advocate wider adoption of explicit model construction in cognitive neuroscience. Although we focus on fMRI, the principle illustrated here is generally applicable to brain data of other modalities.
Item response theory (IRT) models have been widely used in educational measurement testing. When there are repeated observations available for individuals through time, a dynamic structure for the latent trait of ability needs to be incorporated into the model, to accommodate changes in ability. Other complications that often arise in such settings include a violation of the common assumption that test results are conditionally independent, given ability and item difficulty, and that test item difficulties may be partially specified, but subject to uncertainty. Focusing on time series dichotomous response data, a new class of state space models, called Dynamic Item Response (DIR) models, is proposed. The models can be applied either retrospectively to the full data or on-line, in cases where real-time prediction is needed. The models are studied through simulated examples and applied to a large collection of reading test data obtained from MetaMetrics, Inc.
We study a dynamic non-bipartite matching problem. There is a fixed set of agent types, and agents of a given type arrive and depart according to type-specific Poisson processes. Agent departures are not announced in advance. The value of a match is determined by the types of the matched agents. We present an online algorithm that is (1/8)-competitive with respect to the value of the optimal-in-hindsight policy, for arbitrary weighted graphs. Our algorithm treats agents heterogeneously, interpolating between immediate and delayed matching in order to thicken the market while still matching valuable agents opportunistically.
Transformed Generalized Autoregressive Moving Average (TGARMA) models were recently proposed to deal with non-additivity, non-normality and heteroscedasticity in real time series data. In this paper, a Bayesian approach is proposed for TGARMA models, thus extending the original model. We conducted a simulation study to investigate the performance of Bayesian estimation and Bayesian model selection criteria. In addition, a real dataset was analysed using the proposed approach.
We consider Bayesian high-dimensional mediation analysis to identify among a large set of correlated potential mediators the active ones that mediate the effect from an exposure variable to an outcome of interest. Correlations among mediators are commonly observed in modern data analysis; examples include the activated voxels within connected regions in brain image data, regulatory signals driven by gene networks in genome data and correlated exposure data from the same source. When correlations are present among active mediators, mediation analysis that fails to account for such correlation can be sub-optimal and may lead to a loss of power in identifying active mediators. Building upon a recent high-dimensional mediation analysis framework, we propose two Bayesian hierarchical models, one with a Gaussian mixture prior that enables correlated mediator selection and the other with a Potts mixture prior that accounts for the correlation among active mediators in mediation analysis. We develop efficient sampling algorithms for both methods. Various simulations demonstrate that our methods enable effective identification of correlated active mediators, which could be missed by using existing methods that assume prior independence among active mediators. The proposed methods are applied to the LIFECODES birth cohort and the Multi-Ethnic Study of Atherosclerosis (MESA) and identified new active mediators with important biological implications.