Research papers and master's and doctoral theses published by Minh Tang

A semiparametric two-sample hypothesis testing problem for random dot product graphs

90 - Minh Tang, Avanti Athreya, Daniel L. Sussman 2014

Two-sample hypothesis testing for random graphs arises naturally in neuroscience, social networks, and machine learning. In this paper, we consider a semiparametric problem of two-sample hypothesis testing for a class of latent position random graphs. We formulate a notion of consistency in this context and propose a valid test for the hypothesis that two finite-dimensional random dot product graphs on a common vertex set have the same generating latent positions or have generating latent positions that are scaled or diagonal transformations of one another. Our test statistic is a function of a spectral decomposition of the adjacency matrix for each graph and our test procedure is consistent across a broad range of alternatives. We apply our test procedure to real biological data: in a test-retest data set of neural connectome graphs, we are able to distinguish between scans from different subjects; and in the C. elegans connectome, we are able to distinguish between chemical and electrical networks. The latter example is a concrete demonstration that our test can have power even for small sample sizes. We conclude by discussing the relationship between our test procedure and generalized likelihood ratio tests.

Methodology

Generalized Canonical Correlation Analysis for Classification

83 - Cencheng Shen, Ming Sun, Minh Tang 2013

For multiple multivariate data sets, we derive conditions under which Generalized Canonical Correlation Analysis (GCCA) improves classification performance of the projected datasets, compared to standard Canonical Correlation Analysis (CCA) using only two data sets. We illustrate our theoretical results with simulations and a real data experiment.

Machine Learning
