No Arabic abstract
For testing two random vectors for independence, we consider testing whether the distance of one vector from a center point is independent from the distance of the other vector from a center point by a univariate test. In this paper we provide conditions under which it is enough to have a consistent univariate test of independence on the distances to guarantee that the power to detect dependence between the random vectors increases to one, as the sample size increases. These conditions turn out to be minimal. If the univariate test is distribution-free, the multivariate test will also be distribution-free. If we consider multiple center points and aggregate the center-specific univariate tests, the power may be further improved, and the resulting multivariate test may be distribution-free for specific aggregation methods (if the univariate test is distribution-free). We show that several multivariate tests recently proposed in the literature can be viewed as instances of this general approach.
The analysis of record-breaking events is of interest in fields such as climatology, hydrology, economy or sports. In connection with the record occurrence, we propose three distribution-free statistics for the changepoint detection problem. They are CUSUM-type statistics based on the upper and/or lower record indicators which occur in a series. Using a version of the functional central limit theorem, we show that the CUSUM-type statistics are asymptotically Kolmogorov distributed. The main results under the null hypothesis are based on series of independent and identically distributed random variables, but a statistic to deal with series with seasonal component and serial correlation is also proposed. A Monte Carlo study of size, power and changepoint estimate has been performed. Finally, the methods are illustrated by analyzing the time series of temperatures at Madrid, Spain. The $textsf{R}$ package $texttt{RecordTest}$ publicly available on CRAN implements the proposed methods.
We consider the problem of testing whether pairs of univariate random variables are associated. Few tests of independence exist that are consistent against all dependent alternatives and are distribution free. We propose novel tests that are consistent, distribution free, and have excellent power properties. The tests have simple form, and are surprisingly computationally efficient thanks to accompanying innovative algorithms we develop. Moreover, we show that one of the test statistics is a consistent estimator of the mutual information. We demonstrate the good power properties in simulations, and apply the tests to a microarray study where many pairs of genes are examined simultaneously for co-dependence.
The assumption of separability of the covariance operator for a random image or hypersurface can be of substantial use in applications, especially in situations where the accurate estimation of the full covariance structure is unfeasible, either for computational reasons, or due to a small sample size. However, inferential tools to verify this assumption are somewhat lacking in high-dimensional or functional {data analysis} settings, where this assumption is most relevant. We propose here to test separability by focusing on $K$-dimensional projections of the difference between the covariance operator and a nonparametric separable approximation. The subspace we project onto is one generated by the eigenfunctions of the covariance operator estimated under the separability hypothesis, negating the need to ever estimate the full non-separable covariance. We show that the rescaled difference of the sample covariance operator with its separable approximation is asymptotically Gaussian. As a by-product of this result, we derive asymptotically pivotal tests under Gaussian assumptions, and propose bootstrap methods for approximating the distribution of the test statistics. We probe the finite sample performance through simulations studies, and present an application to log-spectrogram images from a phonetic linguistics dataset.
Objective Bayesian inference procedures are derived for the parameters of the multivariate random effects model generalized to elliptically contoured distributions. The posterior for the overall mean vector and the between-study covariance matrix is deduced by assigning two noninformative priors to the model parameter, namely the Berger and Bernardo reference prior and the Jeffreys prior, whose analytical expressions are obtained under weak distributional assumptions. It is shown that the only condition needed for the posterior to be proper is that the sample size is larger than the dimension of the data-generating model, independently of the class of elliptically contoured distributions used in the definition of the generalized multivariate random effects model. The theoretical findings of the paper are applied to real data consisting of ten studies about the effectiveness of hypertension treatment for reducing blood pressure where the treatment effects on both the systolic blood pressure and diastolic blood pressure are investigated.
We propose a novel approach to the analysis of covariance operators making use of concentration inequalities. First, non-asymptotic confidence sets are constructed for such operators. Then, subsequent applications including a k sample test for equality of covariance, a functional data classifier, and an expectation-maximization style clustering algorithm are derived and tested on both simulated and phoneme data.