
Epidemiology of exposure to mixtures: we can't be casual about causality when using or testing methods

Published by Thomas Webster
Publication date: 2020
Research field: Mathematical statistics
Language: English





Background: There is increasing interest in approaches for analyzing the effect of exposure mixtures on health. A key issue is how to simultaneously analyze the often highly collinear components of a mixture, which can create problems such as confounding by co-exposure and co-exposure amplification bias (CAB). Evaluation of novel mixtures methods, typically using synthetic data, is critical to their ultimate utility. Objectives: This paper aims to answer two questions. How do causal models inform the interpretation of statistical models and the creation of synthetic data used to test them? Are novel mixtures methods susceptible to CAB? Methods: We use directed acyclic graphs (DAGs) and linear models to derive closed-form solutions for model parameters and examine how underlying causal assumptions affect the interpretation of model results. Results: The same beta coefficients estimated by a statistical model can have different interpretations depending on the assumed causal structure. Similarly, the method used to simulate data can have implications for the underlying DAG (and vice versa), and therefore for the identification of the parameter being estimated with an analytic approach. We demonstrate that methods that can reproduce the results of linear regression, such as Bayesian kernel machine regression and the new quantile g-computation approach, will be subject to CAB. However, under some conditions, estimates of an overall effect of the mixture are not subject to CAB and even have reduced uncontrolled bias. Discussion: Just as DAGs encode a priori subject matter knowledge, allowing identification of the variable control needed to block analytic bias, we recommend explicitly identifying the DAGs underlying synthetic data created to test statistical mixtures approaches. Estimation of the total effect of a mixture is an important but relatively underexplored topic that warrants further investigation.
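The co-exposure amplification bias mentioned in the abstract can be illustrated with a small simulation. The sketch below is not the paper's own setup: the DAG (a shared source S driving two exposures X1 and X2, an unmeasured confounder U of X1 and the outcome Y, and no effect of X2 on Y), the coefficient values, and the helper names `simulate` and `ols_coefs` are all assumptions chosen for illustration. Under these assumptions, adjusting for the co-exposure X2 removes source-driven variation in X1 while leaving the confounded variation, so the bias in the X1 coefficient grows.

```python
import numpy as np


def simulate(n=200_000, beta1=1.0, seed=0):
    """Draw synthetic data from an assumed linear DAG (illustrative only).

    S -> X1, S -> X2 : shared source makes the exposures collinear
    U -> X1, U -> Y  : U is an unmeasured confounder of X1 and Y
    X1 -> Y          : true causal effect beta1; X2 has no effect on Y
    """
    rng = np.random.default_rng(seed)
    s = rng.normal(size=n)                    # shared exposure source
    u = rng.normal(size=n)                    # unmeasured confounder
    x1 = s + u + rng.normal(size=n)           # exposure of interest
    x2 = s + 0.5 * rng.normal(size=n)         # co-exposure, no effect on Y
    y = beta1 * x1 + u + rng.normal(size=n)   # outcome
    return x1, x2, y


def ols_coefs(y, *cols):
    """Ordinary least squares with an intercept; returns the slopes only."""
    design = np.column_stack([np.ones_like(y), *cols])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1:]


x1, x2, y = simulate()
b_unadj = ols_coefs(y, x1)[0]      # Y ~ X1
b_adj = ols_coefs(y, x1, x2)[0]    # Y ~ X1 + X2

print(f"true effect of X1:         1.000")
print(f"X1 coefficient, X1 only:   {b_unadj:.3f}")
print(f"X1 coefficient, X1 and X2: {b_adj:.3f}")
```

With the variances assumed here, the closed-form large-sample values are 4/3 for the single-exposure model and 16/11 for the model that also includes X2, so the adjusted estimate sits further from the true value of 1. Changing the assumed DAG (for example, letting U also affect X2) changes this conclusion, which is the abstract's point about stating the causal structure behind synthetic data.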




Read also

198 - Ewan Cameron 2014
In astronomical and cosmological studies one often wishes to infer some properties of an infinite-dimensional field indexed within a finite-dimensional metric space given only a finite collection of noisy observational data. Bayesian inference offers an increasingly popular strategy to overcome the inherent ill-posedness of this signal reconstruction challenge. However, there remains a great deal of confusion within the astronomical community regarding the appropriate mathematical devices for framing such analyses and the diversity of available computational procedures for recovering posterior functionals. In this brief research note I will attempt to clarify both these issues from an applied statistics perspective, with insights garnered from my post-astronomy experiences as a computational Bayesian / epidemiological geostatistician.
269 - Siyu Heng , Bo Zhang , Xu Han 2019
Instrumental variables (IVs) are extensively used to estimate treatment effects when the treatment and outcome are confounded by unmeasured confounders; however, weak IVs are often encountered in empirical studies and may cause problems. Many studies have considered building a stronger IV from the original, possibly weak, IV in the design stage of a matched study at the cost of not using some of the samples in the analysis. It is widely accepted that strengthening an IV tends to render nonparametric tests more powerful and will increase the power of sensitivity analyses in large samples. In this article, we re-evaluate this conventional wisdom to bring new insights into this topic. We consider matched observational studies from three perspectives. First, we evaluate the trade-off between IV strength and sample size on nonparametric tests assuming the IV is valid and exhibit conditions under which strengthening an IV increases power and conversely conditions under which it decreases power. Second, we derive a necessary condition for a valid sensitivity analysis model with continuous doses. We show that the $\Gamma$ sensitivity analysis model, which has been previously used to come to the conclusion that strengthening an IV increases the power of sensitivity analyses in large samples, does not apply to the continuous IV setting and thus this previously reached conclusion may be invalid. Third, we quantify the bias of the Wald estimator with a possibly invalid IV under an oracle and leverage it to develop a valid sensitivity analysis framework; under this framework, we show that strengthening an IV may amplify or mitigate the bias of the estimator, and may or may not increase the power of sensitivity analyses. We also discuss how to better adjust for the observed covariates when building an IV in matched studies.
We study causality between bivariate curve time series using the Granger causality generalized measures of correlation. With this measure, we can investigate which curve time series Granger-causes the other; in turn, it helps determine the predictability of any two curve time series. Illustrated by a climatology example, we find that the sea surface temperature Granger-causes the sea-level atmospheric pressure. Motivated by a portfolio management application in finance, we single out those stocks that lead or lag behind Dow-Jones industrial averages. Given a close relationship between S&P 500 index and crude oil price, we determine the leading and lagging variables.
High-dimensional feature selection is a central problem in a variety of application domains such as machine learning, image analysis, and genomics. In this paper, we propose graph-based tests as a useful basis for feature selection. We describe an algorithm for selecting informative features in high-dimensional data, where each observation comes from one of $K$ different distributions. Our algorithm can be applied in a completely nonparametric setup without any distributional assumptions on the data, and it aims at outputting those features in the data that contribute the most to the overall distributional variation. At the heart of our method is the recursive application of distribution-free graph-based tests on subsets of the feature set, located at different depths of a hierarchical clustering tree constructed from the data. Our algorithm recovers all truly contributing features with high probability, while ensuring optimal control on false-discovery. Finally, we show the superior performance of our method over other existing ones through synthetic data, and also demonstrate the utility of the method on a real-life dataset from the domain of climate change.
The popularity of online surveys has increased the prominence of using weights that capture units' probabilities of inclusion for claims of representativeness. Yet, much uncertainty remains regarding how these weights should be employed in the analysis of survey experiments: Should they be used or ignored? If they are used, which estimators are preferred? We offer practical advice, rooted in the Neyman-Rubin model, for researchers producing and working with survey experimental data. We examine simple, efficient estimators for analyzing these data, and give formulae for their biases and variances. We provide simulations that examine these estimators as well as real examples from experiments administered online through YouGov. We find that for examining the existence of population treatment effects using high-quality, broadly representative samples recruited by top online survey firms, sample quantities, which do not rely on weights, are often sufficient. We found that Sample Average Treatment Effect (SATE) estimates did not appear to differ substantially from their weighted counterparts, and they avoided the substantial loss of statistical power that accompanies weighting. When precise estimates of Population Average Treatment Effects (PATE) are essential, we analytically show post-stratifying on survey weights and/or covariates highly correlated with the outcome to be a conservative choice. While we show these substantial gains in simulations, we find limited evidence of them in practice.