Graph Independence Testing

Published by: Jesús Daniel Arroyo Relión

Publication date: 2019

Research field: Mathematical Statistics

Language: English

Authors: Junhao Xiong - Cencheng Shen - Jesus Arroyo

Category: Statistics Applications

Abstract

Identifying statistically significant dependency between variables is a key step in scientific discoveries. Many recent methods, such as distance and kernel tests, have been proposed for valid and consistent independence testing and can be applied to data in Euclidean and non-Euclidean spaces. However, in those works, $n$ pairs of points in $\mathcal{X} \times \mathcal{Y}$ are observed. Here, we consider the setting where a pair of $n \times n$ graphs are observed, and the corresponding adjacency matrices are treated as kernel matrices. Under a $\rho$-correlated stochastic block model, we demonstrate that a naive test (permutation and Pearson's) for a conditional dependency graph model is invalid. Instead, we propose a block-permutation procedure. We prove that our procedure is valid and consistent -- even when the two graphs have different marginal distributions, are weighted or unweighted, and the latent vertex assignments are unknown -- and provide sufficient conditions for the tests to estimate $\rho$. Simulations corroborate these results on both binary and weighted graphs. Applying these tests to the whole-organism, single-cell-resolution structural connectomes of C. elegans, we identify strong statistical dependency between the chemical synapse connectome and the gap junction connectome.
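As a rough illustration of the block-permutation idea in the abstract, the following Python sketch permutes vertex labels only within blocks and compares the Pearson correlation of off-diagonal adjacency entries against the resulting null distribution. It is one plausible reading of the procedure, not the authors' implementation: the function names (`sample_sbm`, `offdiag_pearson`, `block_permutation_test`), the choice of test statistic, and the assumption that block assignments are already known are all hypothetical simplifications. The abstract states that the actual procedure also covers unknown vertex assignments, which this sketch does not attempt.

```python
import numpy as np


def sample_sbm(blocks, p_in, p_out, rng):
    """Sample a symmetric, binary, hollow adjacency matrix from a simple SBM."""
    blocks = np.asarray(blocks)
    same_block = blocks[:, None] == blocks[None, :]
    probs = np.where(same_block, p_in, p_out)
    upper = np.triu(rng.random(probs.shape) < probs, k=1)
    return (upper | upper.T).astype(float)


def offdiag_pearson(A, B):
    """Pearson correlation between the off-diagonal entries of A and B."""
    mask = ~np.eye(A.shape[0], dtype=bool)
    return np.corrcoef(A[mask], B[mask])[0, 1]


def block_permutation_test(A, B, blocks, n_perm=500, seed=0):
    """Permutation test in which vertices are shuffled only within blocks.

    `blocks` holds an (assumed known) block label for every vertex.
    Returns the observed statistic and a two-sided permutation p-value.
    """
    rng = np.random.default_rng(seed)
    blocks = np.asarray(blocks)
    observed = offdiag_pearson(A, B)
    null = np.empty(n_perm)
    for i in range(n_perm):
        perm = np.arange(len(blocks))
        for label in np.unique(blocks):
            idx = np.flatnonzero(blocks == label)
            perm[idx] = rng.permutation(idx)  # shuffle inside this block only
        # Relabel the vertices of B; rows and columns move together.
        null[i] = offdiag_pearson(A, B[np.ix_(perm, perm)])
    p_value = (1 + np.sum(np.abs(null) >= abs(observed))) / (1 + n_perm)
    return observed, p_value
```

Because each permutation keeps every vertex inside its own block, the block structure shared by the two graphs is preserved under the null, so only dependence beyond that shared structure should drive the p-value. An unrestricted vertex permutation would correspond to replacing the inner loop with a single `rng.permutation(len(blocks))`.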

David S. Watson, Marvin N. Wright (2019)

We propose the conditional predictive impact (CPI), a consistent and unbiased estimator of the association between one or several features and a given outcome, conditional on a reduced feature set. Building on the knockoff framework of Candès et al. (2018), we develop a novel testing procedure that works in conjunction with any valid knockoff sampler, supervised learning algorithm, and loss function. The CPI can be efficiently computed for high-dimensional data without any sparsity constraints. We demonstrate convergence criteria for the CPI and develop statistical inference procedures for evaluating its magnitude, significance, and precision. These tests aid in feature and model selection, extending traditional frequentist and Bayesian techniques to general supervised learning tasks. The CPI may also be applied in causal discovery to identify underlying multivariate graph structures. We test our method using various algorithms, including linear regression, neural networks, random forests, and support vector machines. Empirical results show that the CPI compares favorably to alternative variable importance measures and other nonparametric tests of conditional independence on a diverse array of real and simulated datasets. Simulations confirm that our inference procedures successfully control Type I error and achieve nominal coverage probability. Our method has been implemented in an R package, cpi, which can be downloaded from https://github.com/dswatson/cpi.

Category: Machine Learning

Kernel Two-Sample and Independence Tests for Non-Stationary Random Processes

Felix Laumann, Julius von Kugelgen, Mauricio Barahona (2020)

Two-sample and independence tests with the kernel-based MMD and HSIC have shown remarkable results on i.i.d. data and stationary random processes. However, these statistics are not directly applicable to non-stationary random processes, a prevalent form of data in many scientific disciplines. In this work, we extend the application of MMD and HSIC to non-stationary settings by assuming access to independent realisations of the underlying random process. These realisations - in the form of non-stationary time-series measured on the same temporal grid - can then be viewed as i.i.d. samples from a multivariate probability distribution, to which MMD and HSIC can be applied. We further show how to choose suitable kernels over these high-dimensional spaces by maximising the estimated test power with respect to the kernel hyper-parameters. In experiments on synthetic data, we demonstrate superior performance of our proposed approaches in terms of test power when compared to current state-of-the-art functional or multivariate two-sample and independence tests. Finally, we employ our methods on a real socio-economic dataset as an example application.
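The idea of treating each realisation of a time series as one multivariate sample can be sketched with a standard HSIC permutation test, where every row of `X` and `Y` is one realisation measured on a common grid. This is a generic illustration under stated assumptions, not the method of the paper: it uses the biased HSIC estimator with a Gaussian kernel, and it sets the bandwidth from the median pairwise distance instead of maximising estimated test power as the abstract describes. All function names are hypothetical.

```python
import numpy as np


def gaussian_gram(X, bandwidth=None):
    """Gaussian kernel Gram matrix over the rows of X (one row per realisation)."""
    sq_norms = np.sum(X**2, axis=1)
    d2 = np.maximum(sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T, 0.0)
    if bandwidth is None:
        # Assumption: bandwidth from the median pairwise squared distance.
        bandwidth = np.sqrt(np.median(d2[np.triu_indices_from(d2, k=1)]) / 2.0)
    return np.exp(-d2 / (2.0 * bandwidth**2))


def hsic(K, L):
    """Biased HSIC estimate, trace(K H L H) / n^2, from two Gram matrices."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n  # centring matrix
    return np.trace(K @ H @ L @ H) / n**2


def hsic_permutation_test(X, Y, n_perm=500, seed=0):
    """Permutation test of independence between paired realisations X[i], Y[i]."""
    rng = np.random.default_rng(seed)
    K, L = gaussian_gram(X), gaussian_gram(Y)
    observed = hsic(K, L)
    null = np.empty(n_perm)
    for i in range(n_perm):
        perm = rng.permutation(len(Y))  # break the pairing between X and Y
        null[i] = hsic(K, L[np.ix_(perm, perm)])
    p_value = (1 + np.sum(null >= observed)) / (1 + n_perm)
    return observed, p_value
```

For example, `X` could hold 80 simulated random walks of length 20 (built with `np.cumsum` along each row) and `Y` a noisy copy of them. Each row is then a single 20-dimensional sample, so non-stationarity along the time axis does not violate the i.i.d. assumption across rows.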

Category: Statistics Applications

Efficient Multiple Testing Adjustment for Hierarchical Inference

Claude Renaux, Peter Buhlmann (2021)

Hierarchical inference in (generalized) regression problems is powerful for finding significant groups or even single covariates, especially in high-dimensional settings where identifiability of the entire regression parameter vector may be ill-posed. The general method proceeds in a fully data-driven and adaptive way from large to small groups or singletons of covariates, depending on the signal strength and the correlation structure of the design matrix. We propose a novel hierarchical multiple testing adjustment that can be used in combination with any significance test for a group of covariates to perform hierarchical inference. Our adjustment passes on the significance level of certain hypotheses that could not be rejected and is shown to guarantee strong control of the familywise error rate. Our method is at least as powerful as a so-called depth-wise hierarchical Bonferroni adjustment. It provides a substantial gain in power over other previously proposed inheritance hierarchical procedures if the underlying alternative hypotheses occur sparsely along a few branches in the tree-structured hierarchy.

Category: Statistics Applications

Assurance for sample size determination in reliability demonstration testing

Kevin James Wilson, Malcolm Farrow (School of Mathematics, Statistics & Physics) (2019)

Manufacturers are required to demonstrate products meet reliability targets. A typical way to achieve this is with reliability demonstration tests (RDTs), in which a number of products are put on test and the test is passed if a target reliability is achieved. There are various methods for determining the sample size for RDTs, typically based on the power of a hypothesis test following the RDT or risk criteria. Bayesian risk criteria approaches can conflate the choice of sample size and the analysis to be undertaken once the test has been conducted and rely on the specification of somewhat artificial acceptable and rejectable reliability levels. In this paper we offer an alternative approach to sample size determination based on the idea of assurance. This approach chooses the sample size to provide a certain probability that the RDT will result in a successful outcome. It separates the design and analysis of the RDT, allowing different priors for each. We develop the assurance approach for sample size calculations in RDTs for binomial and Weibull likelihoods and propose appropriate prior distributions for the design and analysis of the test. In each case, we illustrate the approach with an example based on real data.

Category: Statistics Applications

Exponential-Family Random Graph Models for Rank-Order Relational Data

Pavel N. Krivitsky (University of Wollongong) (2012)

Rank-order relational data, in which each actor ranks the others according to some criterion, often arise from sociometric measurements of judgment (e.g., self-reported interpersonal interaction) or preference (e.g., relative liking). We propose a class of exponential-family models for rank-order relational data and derive a new class of sufficient statistics for such data, which assume no more than within-subject ordinal properties. Application of MCMC MLE to this family allows us to estimate effects for a variety of plausible mechanisms governing rank structure in cross-sectional context, and to model the evolution of such structures over time. We apply this framework to model the evolution of relative liking judgments in an acquaintance process, and to model recall of relative volume of interpersonal interaction among members of a technology education program.

Category: Statistics Applications
