Recently there have been many research efforts in developing generative models for self-exciting point processes, partly due to their broad applicability to real-world problems. However, we can rarely quantify how well a generative model captures the ground truth, since the ground truth is usually unknown. The challenge lies in the fact that generative models typically provide, at best, good approximations to the ground truth (e.g., through the rich representational power of neural networks) but cannot be precisely the ground truth, so the classic goodness-of-fit (GOF) test framework cannot be used to evaluate their performance. In this paper, we develop a GOF test for generative models of self-exciting processes by making a new connection between this problem and the classical statistical theory of the quasi-maximum-likelihood estimator (QMLE). We present a non-parametric, self-normalizing statistic for the GOF test, the Generalized Score (GS) statistic, and explicitly capture the model misspecification when establishing the asymptotic distribution of the GS statistic. Numerical simulations and real-data experiments validate our theory and demonstrate the good performance of the proposed GS test.
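As a rough illustration of the quasi-likelihood object on which such a score-based test would be built, the sketch below evaluates and maximizes the log-likelihood of an exponential-kernel Hawkes model given observed event times. The kernel form, parametrization, and placeholder data are assumptions made only for illustration; the GS statistic itself and the generative models studied in the paper are not reproduced here.

```python
# Sketch: quasi-log-likelihood of an exponential-kernel Hawkes model,
#   lambda(t) = mu + sum_{t_i < t} alpha * exp(-beta * (t - t_i)).
# The score of this quasi-likelihood is the ingredient a score-type statistic
# would be built from; this is only an illustration, not the paper's test.
import numpy as np
from scipy.optimize import minimize

def hawkes_loglik(params, events, T):
    mu, alpha, beta = params
    if mu <= 0 or alpha < 0 or beta <= 0:
        return -np.inf
    loglik = 0.0
    decay = 0.0   # running value of sum_{j<i} exp(-beta * (t_i - t_j))
    prev_t = None
    for t in events:
        if prev_t is not None:
            decay = (decay + 1.0) * np.exp(-beta * (t - prev_t))
        loglik += np.log(mu + alpha * decay)
        prev_t = t
    # compensator: integral of lambda over [0, T]
    loglik -= mu * T
    loglik -= (alpha / beta) * np.sum(1.0 - np.exp(-beta * (T - events)))
    return loglik

# Quasi-MLE fit by numerical optimization of the negative log-likelihood.
events = np.sort(np.random.default_rng(0).uniform(0, 100, size=200))  # placeholder data
res = minimize(lambda p: -hawkes_loglik(p, events, T=100.0),
               x0=[1.0, 0.5, 1.0], method="Nelder-Mead")
print("QMLE (mu, alpha, beta):", res.x)
```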
In the common nonparametric regression model, we consider the problem of testing for a specific parametric form of the variance function. Recently, Dette and Hetzler (2008) proposed a test statistic based on an empirical process of pseudo residuals. The process converges weakly to a Gaussian process with a complicated covariance kernel depending on the data-generating process. In the present paper we consider a standardized version of this process and propose a martingale transform to obtain asymptotically distribution-free tests for the corresponding Kolmogorov-Smirnov and Cramér-von Mises functionals. The finite-sample properties of the proposed tests are investigated by means of a simulation study.
We propose and study a general method for constructing consistent statistical tests on the basis of possibly indirect, corrupted, or partially available observations. The class of tests devised in the paper contains Neyman's smooth tests, data-driven score tests, and some types of multi-sample tests as basic examples. Our tests are data-driven and additionally incorporate model selection rules. The method allows the use of a wide class of model selection rules based on the penalization idea; in particular, many of the optimal penalties derived in the statistical literature can be used in our tests. We establish the behavior of the model selection rules and data-driven tests under both the null hypothesis and the alternative hypothesis, derive an explicit detectability rule for alternative hypotheses, and prove a master consistency theorem for the tests in the class. The paper shows that the tests are applicable to a wide range of problems, including hypothesis testing in statistical inverse problems, multi-sample problems, and nonparametric hypothesis testing.
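As one basic instance of this construction, the sketch below implements a classical data-driven Neyman smooth test of uniformity (in the spirit of Ledwina's BIC-type selection rule), assuming the data have already been reduced to the unit interval by a probability integral transform. The basis, penalty, and maximal dimension are illustrative choices and do not cover the indirect-observation or inverse-problem settings of the paper.

```python
# Sketch: data-driven Neyman smooth test of uniformity on [0, 1]
# using normalized shifted Legendre polynomials and a BIC-type penalty.
# The selection rule and penalty are textbook choices, used here only
# to illustrate the "score test + model selection" construction.
import numpy as np
from scipy.special import eval_legendre

def smooth_test(u, k_max=10):
    """u: sample assumed Uniform(0, 1) under the null (e.g., after PIT)."""
    n = len(u)
    # phi_j(u) = sqrt(2j+1) * P_j(2u - 1), orthonormal on [0, 1], j = 1..k_max
    comps = np.array([np.sqrt(2 * j + 1) * eval_legendre(j, 2 * u - 1).mean()
                      for j in range(1, k_max + 1)])
    T = n * np.cumsum(comps ** 2)             # Neyman statistics T_1, ..., T_kmax
    penalty = np.arange(1, k_max + 1) * np.log(n)
    k_hat = int(np.argmax(T - penalty)) + 1   # BIC-type data-driven dimension
    return T[k_hat - 1], k_hat                # statistic is asymptotically chi^2_1 under H0

rng = np.random.default_rng(1)
stat, k = smooth_test(rng.uniform(size=500))
print(f"data-driven smooth statistic = {stat:.3f}, selected dimension = {k}")
```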
Statistical modeling plays a fundamental role in understanding the underlying mechanism of massive data (statistical inference) and predicting the future (statistical prediction). Although all models are wrong, researchers try their best to make some of them useful. The question is: how can we measure the usefulness of a statistical model for the data at hand? This is key to statistical prediction. The important statistical problem of testing whether the observations follow the proposed statistical model has attracted relatively little attention. In this paper, we propose a new framework for this problem by building its connection with two-sample distribution comparison. The proposed method can be applied to evaluate a wide range of models. Examples are given to show the performance of the proposed method.
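A minimal version of this idea can be sketched with off-the-shelf tools: fit the candidate model, simulate a fresh sample from the fit, and run a two-sample test between the simulated and observed data. The Gaussian model, sample sizes, and two-sample Kolmogorov-Smirnov comparison below are placeholder assumptions rather than the paper's own statistic.

```python
# Sketch: checking a fitted model by comparing observed data with data
# simulated from the fit, via a two-sample test. The candidate model here
# is a Gaussian and the comparison is a two-sample Kolmogorov-Smirnov test;
# both choices are placeholders for whatever model/statistic is of interest.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
observed = rng.standard_t(df=5, size=1000)        # data "at hand" (placeholder)

# Step 1: fit the candidate model (Gaussian with plug-in estimates).
mu_hat, sigma_hat = observed.mean(), observed.std(ddof=1)

# Step 2: simulate a sample from the fitted model.
simulated = rng.normal(mu_hat, sigma_hat, size=1000)

# Step 3: two-sample comparison of observed vs. simulated data.
stat, pvalue = stats.ks_2samp(observed, simulated)
print(f"KS statistic = {stat:.3f}, p-value = {pvalue:.3f}")
# Note: because the model was fitted on the observed data, a more careful
# procedure would calibrate the null distribution, e.g. by a parametric bootstrap.
```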
The Ising model is one of the simplest and most famous models of interacting systems. It was originally proposed to model ferromagnetic interactions in statistical physics and is now widely used to model spatial processes in many areas such as ecology, sociology, and genetics, usually without testing its goodness of fit. Here, we propose various test statistics and an exact goodness-of-fit test for the finite-lattice Ising model. The theory of Markov bases has been developed in algebraic statistics for exact goodness-of-fit testing using a Monte Carlo approach. However, finding a Markov basis is often computationally intractable. Thus, we develop a Monte Carlo method for exact goodness-of-fit testing of the Ising model that avoids computing a Markov basis and also leads to better connectivity of the Markov chain and hence faster convergence. We show how this method can be applied to analyze the spatial organization of receptors on the cell membrane.
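To fix ideas, the sketch below runs a simple Monte Carlo goodness-of-fit check for a finite-lattice Ising model: lattices are drawn from the fitted model with a Gibbs sampler, and a summary statistic on the observed lattice is compared with its Monte Carlo distribution. This is a plain unconditional, parametric-bootstrap-style check, not the exact conditional test developed in the paper; the lattice size, parameter values, and statistic are illustrative assumptions.

```python
# Sketch: Monte Carlo GOF check for an Ising model on an L x L torus with
# nearest-neighbour interaction strength beta (no external field).
# This is a simple unconditional Monte Carlo check, not the exact
# conditional test described in the paper.
import numpy as np

def gibbs_ising(L, beta, sweeps, rng):
    s = rng.choice([-1, 1], size=(L, L))
    for _ in range(sweeps):
        for i in range(L):
            for j in range(L):
                nb = s[(i-1) % L, j] + s[(i+1) % L, j] + s[i, (j-1) % L] + s[i, (j+1) % L]
                p_up = 1.0 / (1.0 + np.exp(-2.0 * beta * nb))   # full conditional P(s_ij = +1)
                s[i, j] = 1 if rng.random() < p_up else -1
    return s

def pair_statistic(s):
    # total nearest-neighbour agreement (with wrap-around), the natural summary
    return np.sum(s * np.roll(s, 1, axis=0)) + np.sum(s * np.roll(s, 1, axis=1))

rng = np.random.default_rng(0)
L, beta_hat = 12, 0.3                                     # fitted parameter (placeholder)
observed = gibbs_ising(L, 0.5, sweeps=100, rng=rng)       # "observed" lattice (placeholder)

t_obs = pair_statistic(observed)
t_sim = np.array([pair_statistic(gibbs_ising(L, beta_hat, sweeps=100, rng=rng))
                  for _ in range(99)])
# two-sided Monte Carlo p-value around the simulated mean
p = (1 + np.sum(np.abs(t_sim - t_sim.mean()) >= np.abs(t_obs - t_sim.mean()))) / 100
print(f"observed statistic = {t_obs}, Monte Carlo p-value = {p:.2f}")
```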
In many fields, data appear in the form of directions (unit vectors), and standard statistical procedures are not applicable to such directional data. In this study, we propose non-parametric goodness-of-fit testing procedures for general directional distributions based on kernel Stein discrepancy. Our method is based on Stein's operator on spheres, which is derived using Stokes' theorem. Notably, the proposed method is applicable to distributions with an intractable normalization constant, which commonly appear in directional statistics. Experimental results demonstrate that the proposed methods control the type-I error well and have greater power than existing tests, including the test based on the maximum mean discrepancy.
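For orientation, the sketch below computes the standard U-statistic form of the kernel Stein discrepancy in the Euclidean case, with a standard Gaussian null (score s(x) = -x) and an RBF kernel; both choices are assumptions for illustration. The paper's method replaces this with a Stein operator on the sphere derived via Stokes' theorem, which is not reproduced here.

```python
# Sketch: kernel Stein discrepancy (U-statistic form) for a Euclidean null
# density p with score s(x) = grad log p(x), using an RBF kernel.
# Here p is a standard Gaussian, so s(x) = -x; the spherical Stein operator
# of the paper is not implemented in this illustration.
import numpy as np

def ksd_ustat(X, score, bandwidth):
    n, d = X.shape
    S = score(X)                                   # (n, d) score evaluations
    diff = X[:, None, :] - X[None, :, :]           # pairwise differences x_i - x_j
    sqdist = np.sum(diff ** 2, axis=-1)
    K = np.exp(-sqdist / (2 * bandwidth ** 2))     # RBF kernel matrix
    # Stein kernel u_p(x, y) for the RBF kernel, in closed form:
    term1 = (S @ S.T) * K                                               # s(x)^T s(y) k
    term2 = np.einsum('id,ijd->ij', S, diff) / bandwidth ** 2 * K       # s(x)^T grad_y k
    term3 = np.einsum('jd,ijd->ij', S, -diff) / bandwidth ** 2 * K      # s(y)^T grad_x k
    term4 = (d / bandwidth ** 2 - sqdist / bandwidth ** 4) * K          # trace(grad_x grad_y k)
    H = term1 + term2 + term3 + term4
    np.fill_diagonal(H, 0.0)                       # U-statistic: drop diagonal terms
    return H.sum() / (n * (n - 1))

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 2))                      # sample to be tested
ksd = ksd_ustat(X, score=lambda x: -x, bandwidth=1.0)
print(f"KSD U-statistic = {ksd:.4f}")
# In practice the null distribution is approximated (e.g., by a wild/multiplier
# bootstrap) to obtain p-values; here we only compute the discrepancy itself.
```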