
A Cross-validated Ensemble Approach to Robust Hypothesis Testing of Continuous Nonlinear Interactions: Application to Nutrition-Environment Studies

Published by Jeremiah Zhe Liu
Publication date: 2019
Research field: Mathematical Statistics
Paper language: English





Gene-environment and nutrition-environment studies often involve testing of high-dimensional interactions between two sets of variables, each having potentially complex nonlinear main effects on an outcome. Constructing a valid and powerful hypothesis test for such an interaction is challenging, due to the difficulty of constructing an efficient and unbiased estimator for the complex, nonlinear main effects. In this work we address this problem by proposing a Cross-validated Ensemble of Kernels (CVEK) that learns the space of appropriate functions for the main effects using a cross-validated ensemble approach. With a carefully chosen library of base kernels, CVEK flexibly estimates the form of the main-effect functions from the data, and encourages test power by guarding against over-fitting under the alternative. The method is motivated by a study of the interaction between in utero metal exposures and maternal nutrition on children's neurodevelopment in rural Bangladesh. The proposed tests identified evidence of an interaction between mineral and vitamin intake and arsenic and manganese exposures. The detrimental effects of these metals are most pronounced at low nutrient intake levels, suggesting that nutritional interventions in pregnant women could mitigate the adverse impacts of in utero metal exposures on children's neurodevelopment.
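To make the ensemble idea concrete, here is a minimal numpy sketch of the kernel-ensemble step: kernel ridge regression over a small library of base kernels, with ensemble weights derived from cross-validated error. The kernel choices, the inverse-error weighting, and all function names are illustrative assumptions, not the paper's exact estimator (which pairs the ensemble with a variance-component score test for the interaction term).

```python
import numpy as np

def kernel_rbf(X, Y, gamma=1.0):
    # Gaussian (RBF) kernel matrix between rows of X and Y.
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def kernel_poly(X, Y, degree=2):
    # Polynomial kernel matrix.
    return (1.0 + X @ Y.T) ** degree

def cv_error(K, y, lam=1.0, folds=5):
    # K-fold cross-validated MSE of kernel ridge regression
    # using a precomputed kernel matrix K.
    n = len(y)
    idx = np.arange(n)
    errs = []
    for f in range(folds):
        test = idx[f::folds]
        train = np.setdiff1d(idx, test)
        Ktr = K[np.ix_(train, train)]
        alpha = np.linalg.solve(Ktr + lam * np.eye(len(train)), y[train])
        pred = K[np.ix_(test, train)] @ alpha
        errs.append(np.mean((y[test] - pred) ** 2))
    return np.mean(errs)

def cvek_weights(kernels, y, lam=1.0):
    # One simple weighting scheme (an assumption): weight each base
    # kernel inversely to its cross-validated error, normalized to sum to 1.
    errs = np.array([cv_error(K, y, lam) for K in kernels])
    w = 1.0 / errs
    return w / w.sum()
```

The weighted combination of base kernels then serves as the estimated main-effect function space against which an interaction test is run.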




Read also

The extraction of nonstationary signals from blind and semi-blind multivariate observations is a recurrent problem. Numerous algorithms have been developed for this problem, which are based on the exact or approximate joint diagonalization of second or higher order cumulant matrices/tensors of multichannel data. While a great body of research has been dedicated to joint diagonalization algorithms, the selection of the diagonalized matrix/tensor set remains highly problem-specific. Herein, various methods for nonstationarity identification are reviewed and a new general framework based on hypothesis testing is proposed, which results in a classification/clustering perspective to semi-blind source separation of nonstationary components. The proposed method is applied to noninvasive fetal ECG extraction, as case study.
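As a toy illustration of nonstationarity identification (not the paper's hypothesis-testing framework, which operates on jointly diagonalized covariance matrices/tensors), a simple segment-variance ratio already separates a stationary signal from one whose variance drifts over time. The score and its name are illustrative assumptions:

```python
import numpy as np

def nonstationarity_score(x, n_seg=4):
    # Ratio of the largest to the smallest per-segment variance:
    # near 1 for a stationary signal, large when the variance is
    # modulated over time.
    segs = np.array_split(np.asarray(x), n_seg)
    v = np.array([s.var() for s in segs])
    return v.max() / v.min()
```

A thresholded version of such a score is the kind of detector the reviewed framework replaces with a principled hypothesis test.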
Yinan Lin, Zhenhua Lin (2021)
We develop a unified approach to hypothesis testing for various types of widely used functional linear models, such as scalar-on-function, function-on-function and function-on-scalar models. In addition, the proposed test applies to models of mixed types, such as models with both functional and scalar predictors. In contrast with most existing methods that rest on the large-sample distributions of test statistics, the proposed method leverages the technique of bootstrapping max statistics and exploits the variance decay property that is an inherent feature of functional data, to improve the empirical power of tests especially when the sample size is limited and the signal is relatively weak. Theoretical guarantees on the validity and consistency of the proposed test are provided uniformly for a class of test statistics.
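A generic multiplier bootstrap of a max statistic, in the spirit described (mean-zero score vectors, Gaussian multipliers), can be sketched as follows. The statistic and all names are illustrative assumptions, not the paper's construction for functional models:

```python
import numpy as np

def bootstrap_max_pvalue(Z, n_boot=500, seed=0):
    # Z: (n, d) array of scores assumed mean-zero under H0.
    # Observed statistic: max_j |sqrt(n) * mean(Z[:, j])|.
    # Null distribution approximated by the Gaussian-multiplier
    # bootstrap on the centered scores.
    rng = np.random.default_rng(seed)
    n, d = Z.shape
    obs = np.abs(np.sqrt(n) * Z.mean(0)).max()
    Zc = Z - Z.mean(0)
    boots = np.empty(n_boot)
    for b in range(n_boot):
        g = rng.standard_normal(n)
        boots[b] = np.abs((g[:, None] * Zc).sum(0) / np.sqrt(n)).max()
    return (1 + (boots >= obs).sum()) / (1 + n_boot)
```

Taking the maximum over coordinates (rather than, say, a sum) is what lets the test stay powerful against sparse, weak signals, which is the regime the paper targets.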
Persistent homology is a vital tool for topological data analysis. Previous work has developed some statistical estimators for characteristics of collections of persistence diagrams. However, tools that provide statistical inference for observations that are persistence diagrams are limited. Specifically, there is a need for tests that can assess the strength of evidence against a claim that two samples arise from the same population or process. We propose the use of randomization-style null hypothesis significance tests (NHST) for these situations. The test is based on a loss function that comprises pairwise distances between the elements of each sample and all the elements in the other sample. We use this method to analyze a range of simulated and experimental data. Through these examples we experimentally explore the power of the p-values. Our results show that the randomization-style NHST based on pairwise distances can distinguish between samples from different processes, which suggests that its use for hypothesis tests upon persistence diagrams is reasonable. We demonstrate its application on a real dataset of fMRI data of patients with ADHD.
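The randomization test can be sketched generically: a loss built from pairwise distances between the elements of the two samples, recomputed under random relabelings of the pooled data. Here plain Euclidean point clouds stand in for persistence diagrams, and both the statistic and the function names are illustrative assumptions:

```python
import numpy as np

def cross_distance(a, b):
    # Mean pairwise Euclidean distance between the two sets; a stand-in
    # for a distance between persistence-diagram elements.
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return d.mean()

def randomization_pvalue(a, b, n_perm=499, seed=0):
    # Randomization NHST: shuffle the pooled sample, re-split into two
    # groups of the original sizes, and count statistics at least as
    # extreme as the observed one.
    rng = np.random.default_rng(seed)
    pooled = np.vstack([a, b])
    n = len(a)
    obs = cross_distance(a, b)
    hits = 0
    for _ in range(n_perm):
        perm = rng.permutation(len(pooled))
        if cross_distance(pooled[perm[:n]], pooled[perm[n:]]) >= obs:
            hits += 1
    return (hits + 1) / (n_perm + 1)
```

Because the null hypothesis is exchangeability of the group labels, the test is valid without distributional assumptions on the diagrams themselves.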
We describe the utility of point processes and failure rates and the most common point process for modeling failure rates, the Poisson point process. Next, we describe the uniformly most powerful test for comparing the rates of two Poisson point processes for a one-sided test (henceforth referred to as the rate test). A common argument against using this test is that real world data rarely follows the Poisson point process. We thus investigate what happens when the distributional assumptions of tests like these are violated and the test still applied. We find a non-pathological example (using the rate test on a Compound Poisson distribution with Binomial compounding) where violating the distributional assumptions of the rate test makes it perform better (lower error rates). We also find that if we replace the distribution of the test statistic under the null hypothesis with any other arbitrary distribution, the performance of the test (described in terms of the false negative rate to false positive rate trade-off) remains exactly the same. Next, we compare the performance of the rate test to a version of the Wald test customized to the Negative Binomial point process and find it to perform very similarly while being much more general and versatile. Finally, we discuss the applications to Microsoft Azure. The code for all experiments performed is open source and linked in the introduction.
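The one-sided comparison of two Poisson rates admits an exact conditional form: given the total count n1 + n2, the first count is binomial under the null, with success probability set by the relative observation times. A minimal stdlib implementation (the function name is assumed; the paper's exact formulation of the rate test may differ in detail):

```python
from math import comb

def rate_test_pvalue(n1, t1, n2, t2):
    # Exact one-sided p-value for H1: rate1 > rate2, where n1 events
    # were observed over time t1 and n2 over time t2. Conditional on
    # n = n1 + n2, n1 ~ Binomial(n, t1 / (t1 + t2)) under H0, so the
    # p-value is the binomial upper tail P(X >= n1).
    n = n1 + n2
    p = t1 / (t1 + t2)
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n1, n + 1))
```

For example, 10 events versus 2 events over equal observation windows gives a small upper-tail probability, while the reversed counts give a p-value near 1.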
Recently, in the context of covariance matrix estimation, in order to improve as well as to regularize the performance of Tyler's estimator [1], also called the Fixed-Point Estimator (FPE) [2], a shrinkage fixed-point estimator has been introduced in [3]. First, this work extends the results of [3,4] by giving the general solution of the shrinkage fixed-point algorithm. Secondly, by analyzing this solution, called the generalized robust shrinkage estimator, we prove that this solution converges to a unique solution when the shrinkage parameter $\beta$ (losing factor) tends to 0. This solution is exactly the FPE with the trace of its inverse equal to the dimension of the problem. This general result allows one to give another interpretation of the FPE and, more generally, of the Maximum Likelihood approach for covariance matrix estimation when constraints are added. Then, some simulations illustrate our theoretical results as well as the way to choose an optimal shrinkage factor. Finally, this work is applied to a Space-Time Adaptive Processing (STAP) detection problem on real STAP data.
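A common form of the shrinkage fixed-point iteration can be sketched as follows; the trace normalization and the exact placement of the shrinkage term here are assumptions, and the precise variant analyzed in the paper may differ:

```python
import numpy as np

def shrinkage_fpe(X, beta=0.1, n_iter=50):
    # Shrinkage fixed-point iteration (one standard form):
    #   S <- (1 - beta) * (p / n) * sum_i x_i x_i^T / (x_i^T S^{-1} x_i)
    #        + beta * I,
    # renormalized so that trace(S) = p at each step.
    n, p = X.shape
    S = np.eye(p)
    for _ in range(n_iter):
        Sinv = np.linalg.inv(S)
        w = np.einsum('ij,jk,ik->i', X, Sinv, X)  # x_i^T S^{-1} x_i
        S_new = (1 - beta) * (p / n) * (X.T * (1.0 / w)) @ X \
                + beta * np.eye(p)
        S = p * S_new / np.trace(S_new)
    return S
```

The shrinkage term beta * I keeps each iterate well conditioned even when n is small relative to p, which is the motivation for regularizing the plain FPE.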