In this contribution we are interested in proving that a given observation-driven model is identifiable. In the case of a GARCH(p, q) model, a simple sufficient condition has been established in [1] for showing the consistency of the quasi-maximum likelihood estimator. It turns out that this condition applies to a much larger class of observation-driven models, which we call the class of linearly observation-driven models. This class includes standard integer-valued observation-driven time series, such as the log-linear Poisson GARCH or the NBIN-GARCH models.
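To make the observation-driven structure concrete, here is a minimal sketch of the GARCH(1,1) special case, in which the conditional variance is a deterministic function of past observations and the parameters. The function name `garch11_variances`, the initial value `sigma2_init`, and the sample data are illustrative assumptions; nothing here reproduces the identifiability condition of [1].

```python
from typing import List, Sequence


def garch11_variances(y: Sequence[float], omega: float, alpha: float,
                      beta: float, sigma2_init: float) -> List[float]:
    """Filter the conditional variances of a GARCH(1,1) model.

    With 0-based indices the recursion is
        sigma2[k + 1] = omega + alpha * y[k] ** 2 + beta * sigma2[k],
    started at the assumed value sigma2[0] = sigma2_init.
    """
    sigma2 = [sigma2_init]
    for obs in y:
        sigma2.append(omega + alpha * obs ** 2 + beta * sigma2[-1])
    return sigma2


if __name__ == "__main__":
    y = [1.0, 2.0, 0.5]  # hypothetical observations
    # Two different parameter vectors fed with the same observations.
    print(garch11_variances(y, omega=0.1, alpha=0.2, beta=0.7, sigma2_init=1.0))
    print(garch11_variances(y, omega=0.1, alpha=0.3, beta=0.6, sigma2_init=1.0))
```

Informally, identifiability asks whether two distinct parameter vectors can induce the same law for the observations; comparing the filtered variance paths for two parameter choices, as in the usage lines, is only a first sanity check and not a proof.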
We study the stochastic convergence of the Cesàro mean of a sequence of random variables. Such means arise naturally in statistical problems that have a sequential component, where the sequence of random variables is typically derived from a sequence of estimators computed on data. We show that establishing a rate of convergence in probability for a sequence is not sufficient in general to establish a rate in probability for its Cesàro mean. We also present several sets of conditions on the sequence of random variables that are sufficient to guarantee a rate of convergence for its Cesàro mean. We identify common settings in which these sets of conditions hold.
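The negative result can be illustrated with a simple construction, which is an assumption made here and not taken from the paper: let X_i = i * B_i with independent B_i ~ Bernoulli(1/i). Then P(X_n != 0) = 1/n tends to 0, so X_n converges to 0 in probability at any rate, yet the Cesàro mean C_n has mean exactly 1 and second moment at most 3/2, so the Paley-Zygmund inequality gives P(C_n > 1/2) >= 1/6 for every n. The sketch below computes these moments exactly and simulates C_n; all function names are hypothetical.

```python
import random
from fractions import Fraction
from typing import List, Sequence, Tuple


def cesaro_means(xs: Sequence[float]) -> List[float]:
    """Running averages: entry n-1 is (x_1 + ... + x_n) / n."""
    means, total = [], 0.0
    for n, x in enumerate(xs, start=1):
        total += x
        means.append(total / n)
    return means


def exact_moments(n: int) -> Tuple[Fraction, Fraction]:
    """Exact mean and second moment of C_n for X_i = i * B_i,
    B_i ~ Bernoulli(1/i) independent (an illustrative construction)."""
    mean = sum(Fraction(i) * Fraction(1, i) for i in range(1, n + 1)) / n
    var = sum(Fraction(i * i) * Fraction(1, i) * (1 - Fraction(1, i))
              for i in range(1, n + 1)) / (n * n)
    return mean, var + mean ** 2


def simulate_cesaro(n: int, rng: random.Random) -> float:
    """Draw one realisation of C_n under the construction above."""
    total = 0.0
    for i in range(1, n + 1):
        if rng.random() < 1.0 / i:  # B_i = 1 with probability 1/i
            total += i
    return total / n


if __name__ == "__main__":
    rng = random.Random(0)
    for n in (10, 100, 1000):
        draws = [simulate_cesaro(n, rng) for _ in range(2000)]
        share = sum(d > 0.5 for d in draws) / len(draws)
        print(n, exact_moments(n), share)
```

The printed share of draws with C_n > 1/2 is a Monte Carlo estimate only; the exact moments are what carry the argument.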
Optimal linear prediction (also known as kriging) of a random field $\{Z(x)\}_{x\in\mathcal{X}}$ indexed by a compact metric space $(\mathcal{X},d_{\mathcal{X}})$ can be obtained if the mean value function $m\colon\mathcal{X}\to\mathbb{R}$ and the covariance function $\varrho\colon\mathcal{X}\times\mathcal{X}\to\mathbb{R}$ of $Z$ are known. We consider the problem of predicting the value of $Z(x^*)$ at some location $x^*\in\mathcal{X}$ based on observations at locations $\{x_j\}_{j=1}^n$ which accumulate at $x^*$ as $n\to\infty$ (or, more generally, predicting $\varphi(Z)$ based on $\{\varphi_j(Z)\}_{j=1}^n$ for linear functionals $\varphi, \varphi_1, \ldots, \varphi_n$). Our main result characterizes the asymptotic performance of linear predictors (as $n$ increases) based on an incorrect second order structure $(\tilde{m},\tilde{\varrho})$, without any restrictive assumptions on $\varrho, \tilde{\varrho}$ such as stationarity. We, for the first time, provide necessary and sufficient conditions on $(\tilde{m},\tilde{\varrho})$ for asymptotic optimality of the corresponding linear predictor holding uniformly with respect to $\varphi$. These general results are illustrated by weakly stationary random fields on $\mathcal{X}\subset\mathbb{R}^d$ with Matérn or periodic covariance functions, and on the sphere $\mathcal{X}=\mathbb{S}^2$ for the case of two isotropic covariance functions.
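As a small numerical illustration of prediction under a misspecified covariance, the following sketch computes simple kriging weights on the real line with an exponential covariance (the Matérn family with smoothness 1/2) and compares the mean squared error of the predictor built from a wrong range parameter with that of the optimal one. The design points x_j = 1/j, the range values 1.0 and 3.0, and all function names are assumptions chosen for illustration; the means are taken to be known and equal, and the paper's conditions are not implemented here.

```python
import math
from typing import Callable, List, Sequence

Cov = Callable[[float, float], float]


def solve(a: Sequence[Sequence[float]], b: Sequence[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def exp_cov(range_param: float) -> Cov:
    """Exponential covariance exp(-|x - y| / range_param) with unit variance."""
    def cov(x: float, y: float) -> float:
        return math.exp(-abs(x - y) / range_param)
    return cov


def kriging_weights(xs: Sequence[float], x_star: float, cov: Cov) -> List[float]:
    """Simple kriging weights: solve K w = k(x_star) for distinct locations xs."""
    gram = [[cov(a, b) for b in xs] for a in xs]
    cross = [cov(a, x_star) for a in xs]
    return solve(gram, cross)


def mse(xs: Sequence[float], x_star: float, weights: Sequence[float],
        true_cov: Cov) -> float:
    """MSE of the linear predictor sum_i w_i Z(x_i) under the true covariance,
    assuming the mean is known (so only the covariance matters)."""
    quad = sum(wi * wj * true_cov(xi, xj)
               for wi, xi in zip(weights, xs)
               for wj, xj in zip(weights, xs))
    cross = sum(wi * true_cov(xi, x_star) for wi, xi in zip(weights, xs))
    return true_cov(x_star, x_star) - 2.0 * cross + quad


if __name__ == "__main__":
    true_cov, wrong_cov = exp_cov(1.0), exp_cov(3.0)
    x_star = 0.0
    for n in (2, 5, 10, 20):
        xs = [1.0 / j for j in range(1, n + 1)]  # points accumulating at x_star
        best = mse(xs, x_star, kriging_weights(xs, x_star, true_cov), true_cov)
        used = mse(xs, x_star, kriging_weights(xs, x_star, wrong_cov), true_cov)
        print(n, used / best)
```

The printed ratio of the two errors is the quantity whose limiting behaviour asymptotic optimality concerns; a ratio approaching 1 in one example is suggestive only.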
In the last decade, the secondary use of large data from health systems, such as electronic health records, has demonstrated great promise in advancing biomedical discoveries and improving clinical decision making. However, there is an increasing concern about biases in association studies caused by misclassification in the binary outcomes derived from electronic health records. We revisit the classical logistic regression model with misclassified outcomes. Although local identification conditions have previously been established in some related settings, the global identification of such models remains largely unknown and is an important open question. We derive necessary and sufficient conditions for global identifiability of logistic regression models with misclassified outcomes, using a novel approach termed submodel analysis and a technique adapted from the Picard-Lindelöf existence theorem in ordinary differential equations. In particular, our results are applicable to logistic models with discrete covariates, which is a common situation in biomedical studies. The conditions are easy to verify in practice. In addition to model identifiability, we propose a hypothesis testing procedure for regression coefficients in the misclassified logistic regression model when the model is not identifiable under the null.
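One reason global identifiability is delicate in this model is the well-known label-switching symmetry, sketched below under the usual parametrization (assumed here): with false positive rate `fp` and false negative rate `fn`, the observed outcome has probability fp * (1 - p) + (1 - fn) * p, where p is the logistic probability of the true outcome. The parameter sets (beta, fp, fn) and (-beta, 1 - fn, 1 - fp) then give identical observed probabilities, which is why a constraint such as fp + fn < 1 is commonly imposed. Function names and numerical values are hypothetical, and the paper's necessary and sufficient conditions are not reproduced.

```python
import math
from typing import Sequence


def expit(t: float) -> float:
    """Logistic function 1 / (1 + exp(-t))."""
    return 1.0 / (1.0 + math.exp(-t))


def observed_prob(x: Sequence[float], beta: Sequence[float],
                  fp: float, fn: float) -> float:
    """P(observed outcome = 1 | x) when the true outcome is logistic in x
    and is flipped with false positive rate fp and false negative rate fn."""
    p = expit(sum(b * xi for b, xi in zip(beta, x)))
    return fp * (1.0 - p) + (1.0 - fn) * p


if __name__ == "__main__":
    beta, fp, fn = [0.5, -1.2], 0.1, 0.2          # hypothetical values
    mirrored = [-b for b in beta]                  # label-switched coefficients
    for x in ([1.0, 0.0], [1.0, 1.0], [1.0, 2.5]):
        a = observed_prob(x, beta, fp, fn)
        b = observed_prob(x, mirrored, 1.0 - fn, 1.0 - fp)
        print(x, a, b)  # the two columns agree up to rounding
```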
The simplicial condition and other stronger conditions that imply it have recently played a central role in developing polynomial time algorithms with provable asymptotic consistency and sample complexity guarantees for topic estimation in separable topic models. Of these algorithms, those that rely solely on the simplicial condition are impractical while the practical ones need stronger conditions. In this paper, we demonstrate, for the first time, that the simplicial condition is a fundamental, algorithm-independent, information-theoretic necessary condition for consistent separable topic estimation. Furthermore, under the simplicial condition alone, we present a practical quadratic-complexity algorithm based on random projections which consistently detects all novel words of all topics using only up to second-order empirical word moments. This algorithm is amenable to distributed implementation, making it attractive for big-data scenarios involving a network of large distributed databases.
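The geometric idea behind random-projection detection can be sketched generically: the maximizer of a random linear functional over a finite point set is, almost surely, an extreme point of its convex hull, so repeating the projection with many random directions collects extreme points. The code below is an illustration of that principle on hypothetical 2-D points, not the paper's algorithm; how the points would be derived from second-order word moments is not shown, and the function name is an assumption.

```python
import random
from typing import Sequence, Set


def extreme_points_by_projection(points: Sequence[Sequence[float]],
                                 num_directions: int,
                                 rng: random.Random) -> Set[int]:
    """Return indices of points that maximize at least one random projection.

    Each direction has i.i.d. standard normal coordinates; the argmax of the
    projected values is an extreme point of the convex hull (ties have
    probability zero for distinct points).
    """
    dim = len(points[0])
    found: Set[int] = set()
    for _ in range(num_directions):
        direction = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        scores = [sum(d * c for d, c in zip(direction, p)) for p in points]
        found.add(max(range(len(points)), key=scores.__getitem__))
    return found


if __name__ == "__main__":
    # Three vertices of a triangle followed by three strictly interior points.
    pts = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0),
           (0.2, 0.2), (0.3, 0.3), (0.25, 0.5)]
    print(sorted(extreme_points_by_projection(pts, 200, random.Random(0))))
```

Each projection costs one pass over the points, and the projections are independent of one another, which is what makes approaches of this kind easy to distribute.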
We study parameter identifiability of directed Gaussian graphical models with one latent variable. In the scenario we consider, the latent variable is a confounder that forms a source node of the graph and is a parent to all other nodes, which correspond to the observed variables. We give a graphical condition that is sufficient for the Jacobian matrix of the parametrization map to have full rank, which entails that the parametrization is generically finite-to-one, a fact that is sometimes also referred to as local identifiability. We also derive a graphical condition that is necessary for such identifiability. Finally, we give a condition under which generic parameter identifiability can be determined from identifiability of a model associated with a subgraph. The power of these criteria is assessed via an exhaustive algebraic computational study on models with 4, 5, and 6 observable variables.
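The Jacobian rank check can be illustrated on the simplest special case, assumed here for illustration: no directed edges among the observed variables, so that the covariance matrix is Sigma = diag(omega) + lambda lambda^T (the one-factor model). The sketch builds the Jacobian of the map (lambda, omega) -> upper triangle of Sigma and computes its rank exactly over the rationals. Function names are hypothetical, and the paper's graphical criteria and its models with 4 to 6 observed variables are not implemented.

```python
from fractions import Fraction
from typing import List, Sequence


def rank(matrix: Sequence[Sequence[int]]) -> int:
    """Rank via exact Gaussian elimination over the rationals."""
    m = [[Fraction(v) for v in row] for row in matrix]
    r = 0
    ncols = len(m[0]) if m else 0
    for c in range(ncols):
        piv = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if piv is None:
            continue
        m[r], m[piv] = m[piv], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            if f != 0:
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r


def one_factor_jacobian(loadings: Sequence[int]) -> List[List[int]]:
    """Jacobian of (lambda, omega) -> (sigma_ij, i <= j) for
    Sigma = diag(omega) + lambda lambda^T. Rows follow the upper triangle of
    Sigma; columns are lambda_1..lambda_m, then omega_1..omega_m."""
    m = len(loadings)
    rows = []
    for i in range(m):
        for j in range(i, m):
            d_lambda = [0] * m
            d_lambda[i] += loadings[j]   # d(lambda_i * lambda_j) / d lambda_i
            d_lambda[j] += loadings[i]   # d(lambda_i * lambda_j) / d lambda_j
            d_omega = [1 if (i == j and k == i) else 0 for k in range(m)]
            rows.append(d_lambda + d_omega)
    return rows


if __name__ == "__main__":
    print(rank(one_factor_jacobian([1, 2, 3])))  # 3 observed: 6 parameters
    print(rank(one_factor_jacobian([1, 2])))     # 2 observed: 4 parameters
    print(rank(one_factor_jacobian([1, 2, 0])))  # a non-generic point
```

With three observed variables the rank equals the number of parameters at a generic point, with two it cannot, and at the point with a zero loading the rank drops, which is why such statements are made generically.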