Modularity is a popular metric for quantifying the degree of community structure within a network. The distribution of the largest eigenvalue of a network's edge weight or adjacency matrix is well studied and is frequently used as a substitute for modularity when performing statistical inference. However, we show that the largest eigenvalue and modularity are asymptotically uncorrelated, which suggests the need for inference directly on modularity itself when the network size is large. To this end, we derive the asymptotic distributions of modularity in the case where the network's edge weight matrix belongs to the Gaussian Orthogonal Ensemble, and study the statistical power of the corresponding test for community structure under an alternative model. We empirically explore universality extensions of the limiting distribution and demonstrate the accuracy of these asymptotic distributions through type I error simulations. We also compare the empirical power of the modularity-based tests with that of some existing methods. Our method is then used to test for the presence of community structure in two real data applications.
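For readers unfamiliar with the quantity being tested, the following is a minimal sketch of the standard Newman-Girvan modularity of a fixed partition, written for a symmetric matrix of nonnegative edge weights. The function name `modularity` and the toy network are illustrative assumptions; the abstract does not state the exact definition used in the paper, which may differ (for example, in how Gaussian, possibly negative, weights are handled or in maximizing over partitions).

```python
def modularity(weights, labels):
    """Newman-Girvan modularity of a fixed partition (illustrative sketch).

    weights: symmetric n x n list of lists of nonnegative edge weights.
    labels:  length-n list giving the community label of each node.
    """
    n = len(weights)
    strength = [sum(row) for row in weights]  # weighted degree of each node
    total = sum(strength)                     # equals 2m for an undirected network
    q = 0.0
    for i in range(n):
        for j in range(n):
            if labels[i] == labels[j]:
                # observed weight minus the weight expected under the
                # degree-preserving null model
                q += weights[i][j] - strength[i] * strength[j] / total
    return q / total


# Toy example (hypothetical): two triangles joined by a single edge.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = [[0.0] * 6 for _ in range(6)]
for u, v in edges:
    A[u][v] = A[v][u] = 1.0

q_split = modularity(A, [0, 0, 0, 1, 1, 1])  # one community per triangle
q_single = modularity(A, [0] * 6)            # everything in one community
```

Splitting the toy network into its two triangles gives a modularity of 5/14, while placing all nodes in a single community gives 0, which is the sense in which larger values indicate stronger community structure.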
In this paper we present a novel method for estimating the parameters of parametric diffusion processes. Our approach is based on a closed-form Maximum Likelihood estimator for an approximating Continuous Time Markov Chain (CTMC) of the diffusion process. Unlike typical time-discretization approaches, such as pseudo-likelihood approximations with the Shoji-Ozaki or Kessler's method, the CTMC approximation introduces no time-discretization error during parameter estimation, and is thus well-suited for typical econometric situations with infrequently sampled data. Due to the structure of the CTMC, we are able to obtain closed-form approximations for the sample likelihood which hold for general univariate diffusions. Comparisons of the state-discretization approach with approximate MLE (time-discretization) and exact MLE (when applicable) demonstrate favorable performance of the CTMC estimator. Simulated examples are provided in addition to real data experiments with FX rates and constant maturity interest rates.
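The idea of discretizing the state space rather than time can be sketched as follows. This is a generic illustration under stated assumptions, not the estimator of the paper: it uses a standard upwind (locally consistent) generator on a uniform grid, a plain scaling-and-squaring matrix exponential, and snaps observations to the nearest grid point. All function names, the grid, and the Ornstein-Uhlenbeck-style drift in the usage lines are hypothetical.

```python
import math


def ctmc_generator(grid, mu, sigma):
    """Generator matrix of a birth-death CTMC approximating dX = mu dt + sigma dW."""
    n, h = len(grid), grid[1] - grid[0]
    Q = [[0.0] * n for _ in range(n)]
    for i, x in enumerate(grid):
        diffusion = sigma(x) ** 2 / (2.0 * h * h)
        if i + 1 < n:
            Q[i][i + 1] = diffusion + max(mu(x), 0.0) / h   # rate of moving up
        if i > 0:
            Q[i][i - 1] = diffusion + max(-mu(x), 0.0) / h  # rate of moving down
        Q[i][i] = -sum(Q[i])                                # rows sum to zero
    return Q


def matmul(A, B):
    Bt = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]


def expm(Q, t, order=12):
    """exp(Q t) by scaling and squaring with a truncated Taylor series."""
    n = len(Q)
    norm = max(sum(abs(v) for v in row) for row in Q) * t
    s = max(0, math.ceil(math.log2(norm / 0.5))) if norm > 0 else 0
    M = [[v * t / 2 ** s for v in row] for row in Q]
    P = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in P]
    for k in range(1, order + 1):
        term = [[v / k for v in row] for row in matmul(term, M)]
        P = [[p + q for p, q in zip(rp, rq)] for rp, rq in zip(P, term)]
    for _ in range(s):
        P = matmul(P, P)
    return P


def ctmc_loglik(obs, grid, dt, mu, sigma):
    """Log-likelihood of equally spaced observations under the CTMC approximation."""
    P = expm(ctmc_generator(grid, mu, sigma), dt)
    idx = [min(range(len(grid)), key=lambda j: abs(grid[j] - x)) for x in obs]
    return sum(math.log(max(P[a][b], 1e-300)) for a, b in zip(idx, idx[1:]))


# Hypothetical usage: mean-reverting drift, constant volatility, sampling gap dt = 1.
grid = [-1.0 + 0.1 * i for i in range(21)]
ll = ctmc_loglik([0.0, 0.12, -0.3, 0.05], grid, 1.0, lambda x: -x, lambda x: 0.5)
```

Because the transition matrix is the exact exponential of the generator for any sampling gap `dt`, the only approximation in this sketch comes from the grid spacing, which is why the approach is attractive when observations are far apart in time.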
We investigate the asymptotic behavior of several variants of the scan statistic applied to empirical distributions, which can be used to detect the presence of an anomalous interval of any length. Of particular interest is the Studentized scan statistic, which is preferable in practice. The main ingredients in the proof are Kolmogorov's theorem, a Poisson approximation, and recent technical results by Kabluchko et al. (2014).
We characterize completely the Gneiting class of space-time covariance functions and give more relaxed conditions on the involved functions. We then show necessary conditions for the construction of compactly supported functions of the Gneiting type. These conditions are very general since they do not depend on the Euclidean norm. Finally, we discuss a general class of positive definite functions, used for multivariate Gaussian random fields. For this class, we show necessary criteria for its generator to be compactly supported.
This paper introduces a Nearly Unstable INteger-valued AutoRegressive Conditional Heteroskedasticity (NU-INARCH) process for dealing with count time series data. It is proved that a proper normalization of the NU-INARCH process endowed with a Skorohod topology weakly converges to a Cox-Ingersoll-Ross diffusion. The asymptotic distribution of the conditional least squares estimator of the correlation parameter is established as a functional of certain stochastic integrals. Numerical experiments based on Monte Carlo simulations are provided to verify the behavior of the asymptotic distribution under finite samples. These simulations reveal that the nearly unstable approach provides satisfactory and better results than those based on the stationarity assumption even when the true process is not that close to non-stationarity. A unit root test is proposed and its Type-I error and power are examined via Monte Carlo simulations. As an illustration, the proposed methodology is applied to the daily number of deaths due to COVID-19 in the United Kingdom.
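To make the setting concrete, the sketch below simulates a Poisson INARCH(1) count series whose autoregressive coefficient is close to one and computes the conditional least squares estimate by regressing each count on its predecessor. The parametrization (conditional mean `omega + alpha * X[t-1]` with `alpha = 1 - gamma / n`), the function names, and the numerical values are assumptions for illustration only; the abstract does not give the exact specification of the NU-INARCH process.

```python
import random


def poisson(lam, rng):
    """Draw a Poisson(lam) variate by counting unit-rate arrivals in [0, lam]."""
    count, t = 0, rng.expovariate(1.0)
    while t <= lam:
        count += 1
        t += rng.expovariate(1.0)
    return count


def simulate_inarch1(n, omega, alpha, rng, x0=0):
    """Simulate X_t | past ~ Poisson(omega + alpha * X_{t-1}) for t = 1..n."""
    path = [x0]
    for _ in range(n):
        path.append(poisson(omega + alpha * path[-1], rng))
    return path


def cls_estimate(path):
    """Conditional least squares: regress X_t on X_{t-1} with an intercept."""
    prev, cur = path[:-1], path[1:]
    m_prev, m_cur = sum(prev) / len(prev), sum(cur) / len(cur)
    sxy = sum((p - m_prev) * (c - m_cur) for p, c in zip(prev, cur))
    sxx = sum((p - m_prev) ** 2 for p in prev)
    alpha_hat = sxy / sxx
    omega_hat = m_cur - alpha_hat * m_prev
    return omega_hat, alpha_hat


# Hypothetical nearly unstable configuration: alpha approaches 1 at rate 1/n.
n, gamma, omega = 400, 4.0, 1.0
rng = random.Random(0)
path = simulate_inarch1(n, omega, 1.0 - gamma / n, rng)
omega_hat, alpha_hat = cls_estimate(path)
```

Repeating the simulation over many seeds and inspecting the spread of `n * (alpha_hat - alpha)` is one way to see, informally, why a non-standard limiting distribution arises near the unit root.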
Stochastic models of interacting populations have crucial roles in scientific fields such as epidemiology and ecology, yet the standard approach to extending an ordinary differential equation model to a Markov chain does not have sufficient flexibility in the mean-variance relationship to match data (e.g. \cite{bjornstad2001noisy}). A previous theory on time-homogeneous dynamics over a single arrow by \cite{breto2011compound} showed how gamma white noise could be used to construct certain over-dispersed Markov chains, leading to widely used models (e.g. \cite{breto2009time,he2010plug}). In this paper, we define systemic infinitesimal over-dispersion, developing theory and methodology for general time-inhomogeneous stochastic graphical models. Our approach, based on Dirichlet noise, leads to a new class of Markov models over general directed graphs. It is compatible with modern likelihood-based inference methodologies (e.g. \cite{ionides2006inference,ionides2015inference,king2008inapparent}) and therefore we can assess how well the new models fit data. We demonstrate our methodology on a widely analyzed measles dataset, adding Dirichlet noise to a classical SEIR (Susceptible-Exposed-Infected-Recovered) model. We find that the proposed methodology attains a higher log-likelihood than the gamma white noise approach, and the resulting parameter estimates provide new insights into the over-dispersion of this biological system.