Revisiting the random shift approach for testing in spatial statistics

Added by Tomáš Mrkvička

Publication date 2019

fields Mathematical Statistics

Language English

Authors Tomas Mrkvicka - Jiri Dvorak - Jonatan A. Gonzalez

Methodology

We consider the problem of non-parametric testing of independence of two components of a stationary bivariate spatial process. In particular, we revisit the random shift approach, which has become a standard method for testing the independent superposition hypothesis in spatial statistics and is widely used in practical applications. However, this method suffers from liberality, because the toroidal correction breaks the marginal spatial correlation structure. As a result, the assumption of exchangeability, which is essential for the Monte Carlo test to be exact, is not fulfilled. We present a number of permutation strategies and show that, in the random field case, the random shift with the variance correction is a suitable improvement over the torus correction: it reduces the liberality and achieves the largest power of all investigated variants. Several approaches to obtaining the variance for the variance correction method were studied. With the sample covariance as the test statistic, the best results were achieved with the correction factor $1/n$, which corresponds to the asymptotic order of the variance of the test statistic. In the point process case, the problem of deviations from exchangeability is far more complex, and we propose an alternative strategy based on the mean cross nearest-neighbor distance and the torus correction. It reduces the liberality but achieves slightly lower power than the usual cross $K$-function. We therefore recommend it when the point patterns are clustered, the situation in which the cross $K$-function is liberal.
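As a rough illustration of the torus-corrected random shift test discussed in the abstract, the following Python sketch applies it to two random fields observed on the same regular grid, with the sample covariance as the test statistic. The function names, the grid setting, and the two-sided p-value are assumptions made for this example and are not taken from the paper. The sketch covers only the plain torus variant, which the abstract describes as liberal; the variance correction is not implemented here.

```python
import numpy as np

def sample_covariance(x, y):
    """Sample covariance between two fields observed on the same grid."""
    return float(np.mean((x - x.mean()) * (y - y.mean())))

def random_shift_test(x, y, n_shifts=999, seed=0):
    """Monte Carlo p-value for independence of x and y using toroidal random shifts.

    Hypothetical helper: x and y are 2-D arrays of equal shape.
    """
    rng = np.random.default_rng(seed)
    observed = sample_covariance(x, y)
    rows, cols = y.shape
    simulated = np.empty(n_shifts)
    for i in range(n_shifts):
        # Draw a random shift vector and wrap y around the grid edges (torus).
        dr = int(rng.integers(0, rows))
        dc = int(rng.integers(0, cols))
        shifted = np.roll(y, shift=(dr, dc), axis=(0, 1))
        simulated[i] = sample_covariance(x, shifted)
    # Two-sided Monte Carlo p-value; the observed value counts as one replicate.
    extreme = int(np.sum(np.abs(simulated) >= abs(observed)))
    return observed, (extreme + 1) / (n_shifts + 1)
```

Calling `random_shift_test(x, y)` returns the observed covariance and the Monte Carlo p-value. Because wrapping joins opposite edges of the grid, the shifted field has an artificial discontinuity, which is the source of the non-exchangeability the abstract points to.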

Spatial Statistics

68 - Noel Cressie , Matthew T. Moores 2021

Spatial statistics is an area of study devoted to the statistical analysis of data that have a spatial label associated with them. Geographers often refer to the location information associated with the attribute information, whose study defines a research area called spatial analysis. Many of the ways to manipulate spatial data are driven by algorithms with no uncertainty quantification associated with them. When a spatial analysis is statistical, that is, it incorporates uncertainty quantification, it falls in the research area called spatial statistics. The primary feature of spatial statistical models is that nearby attribute values are more statistically dependent than distant attribute values; this is a paraphrasing of what is sometimes called the First Law of Geography (Tobler, 1970).

Methodology

Missing at Random or Not: A Semiparametric Testing Approach

103 - Rui Duan , C. Jason Liang , Pamela Shaw 2020

Practical problems with missing data are common, and statistical methods have been developed to address the validity and/or efficiency of statistical procedures in their presence. A central and longstanding interest is the mechanism governing data missingness, and correctly identifying this mechanism is crucial for conducting proper practical investigations. The conventional notions comprise three classes: missing completely at random, missing at random, and missing not at random. In this paper, we present a new hypothesis testing approach for deciding between missing at random and missing not at random. Since the potential alternatives to missing at random are broad, we focus our investigation on a general class of models with instrumental variables for data missing not at random. Our setting is broadly applicable because the model for the missing data is nonparametric and requires no explicit model specification for the data missingness. The foundational idea is to develop appropriate discrepancy measures between estimators whose properties differ significantly only when missing at random does not hold. We show that our new hypothesis testing approach achieves an objective, data-oriented choice between missing at random and missing not at random. We demonstrate the feasibility, validity, and efficacy of the new test by theoretical analysis, simulation studies, and a real data analysis.

Methodology Econometrics

Testing normality using the summary statistics with application to meta-analysis

74 - Dehui Luo , Xiang Wan , Jiming Liu 2018

Meta-analysis is the most important tool for providing high-level evidence in evidence-based medicine: it allows researchers to statistically summarize and combine data from multiple studies. In meta-analysis, mean differences are frequently used as effect size measures for continuous data, such as Cohen's d and Hedges' g. To calculate mean-difference-based effect sizes, the sample mean and standard deviation are two essential summary measures. However, many clinical reports do not directly record the sample mean and standard deviation. Instead, the sample size, median, minimum and maximum values and/or the first and third quartiles are reported. As a result, researchers have to transform the reported information into the sample mean and standard deviation in order to compute the effect size. Since most of the popular transformation methods were developed under the assumption that the underlying data are normal, it is necessary to perform a pre-test before transforming the summary statistics. In this article, we introduce test statistics for three popular scenarios in meta-analysis. We suggest that medical researchers perform a normality test on the selected studies before using them in further analysis. Moreover, we apply the newly proposed test statistics in three different case studies to demonstrate their usage. The real data case studies indicate that the new test statistics are easy to apply in practice and that, by following the recommended path to conduct the meta-analysis, researchers can obtain more reliable conclusions.

Methodology

A Random Forest Approach for Modeling Bounded Outcomes

70 - Leonie Weinhold , Matthias Schmid , Marvin N. Wright 2019

Random forests have become an established tool for classification and regression, in particular in high-dimensional settings and in the presence of complex predictor-response relationships. For bounded outcome variables restricted to the unit interval, however, classical random forest approaches may severely suffer as they do not account for the heteroscedasticity in the data. A random forest approach is proposed for relating beta distributed outcomes to explanatory variables. The approach explicitly makes use of the likelihood function of the beta distribution for the selection of splits during the tree-building procedure. In each iteration of the tree-building algorithm one chooses the combination of explanatory variable and splitting rule that maximizes the log-likelihood function of the beta distribution with the parameter estimates derived from the nodes of the currently built tree. Several simulation studies demonstrate the properties of the method and compare its performance to classical random forest approaches as well as to parametric regression models.
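The split-selection rule described in this abstract can be sketched for a single numeric predictor as follows. This is an illustrative Python example, not the authors' implementation: the function names are hypothetical, and the beta parameters in each child node are estimated by the method of moments, which is an assumption made here because the abstract does not say how the node estimates are obtained.

```python
import math
import numpy as np

def beta_loglik(y):
    """Beta log-likelihood of y in (0, 1), evaluated at method-of-moments estimates."""
    m, v = y.mean(), y.var()
    # The moment estimates exist only when 0 < v < m * (1 - m).
    if v <= 0 or v >= m * (1 - m):
        return -np.inf
    k = m * (1 - m) / v - 1
    a, b = m * k, (1 - m) * k
    log_beta_fn = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    terms = (a - 1) * np.log(y) + (b - 1) * np.log1p(-y)
    return float(np.sum(terms) - y.size * log_beta_fn)

def best_split(x, y, min_node=5):
    """Return (threshold, loglik) maximizing the summed child log-likelihoods."""
    order = np.argsort(x)
    xs, ys = x[order], y[order]
    best = (None, -np.inf)
    for i in range(min_node, len(xs) - min_node + 1):
        if xs[i] == xs[i - 1]:
            continue  # no threshold separates tied predictor values
        ll = beta_loglik(ys[:i]) + beta_loglik(ys[i:])
        if ll > best[1]:
            best = ((xs[i - 1] + xs[i]) / 2, ll)
    return best
```

In a full tree-building procedure this search would be repeated over the candidate explanatory variables at each node, keeping the variable and threshold with the largest log-likelihood.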

Methodology

A Unified Approach to Hypothesis Testing for Functional Linear Models

102 - Yinan Lin , Zhenhua Lin 2021

We develop a unified approach to hypothesis testing for various types of widely used functional linear models, such as scalar-on-function, function-on-function and function-on-scalar models. In addition, the proposed test applies to models of mixed types, such as models with both functional and scalar predictors. In contrast with most existing methods that rest on the large-sample distributions of test statistics, the proposed method leverages the technique of bootstrapping max statistics and exploits the variance decay property that is an inherent feature of functional data, to improve the empirical power of tests especially when the sample size is limited and the signal is relatively weak. Theoretical guarantees on the validity and consistency of the proposed test are provided uniformly for a class of test statistics.

Methodology Statistics Theory