
On the estimators of autocorrelation model parameters

Posted by Chris Fleming
Publication date: 2013
Research field: Mathematical Statistics
Paper language: English





Estimation of autocorrelations and spectral densities is of fundamental importance in many fields of science, from identifying pulsar signals in astronomy to measuring heart beats in medicine. In circumstances where one is interested in specific autocorrelation functions that do not fit into any simple families of models, such as auto-regressive moving average (ARMA), estimating model parameters is generally approached in one of two ways: by fitting the model autocorrelation function to a non-parametric autocorrelation estimate via regression analysis, or by fitting the model autocorrelation function directly to the data via maximum likelihood. Prior literature suggests that variogram regression yields parameter estimates of comparable quality to maximum likelihood. In this letter we demonstrate that, as sample size increases, the accuracy of the maximum-likelihood estimates (MLE) ultimately improves by orders of magnitude beyond that of variogram regression. For relatively continuous and Gaussian processes, this improvement can occur for sample sizes of less than 100. Moreover, even where the accuracy of these methods is comparable, the MLE remains almost universally better and, more critically, variogram regression does not provide reliable confidence intervals. Inaccurate regression parameter estimates are typically accompanied by underestimated standard errors, whereas likelihood provides reliable confidence intervals.
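As a concrete illustration of the two approaches compared above, the following sketch fits an assumed exponential autocorrelation model rho(lag) = exp(-lag/tau) to a simulated Gaussian series in both ways: by least-squares regression of the model onto a non-parametric autocorrelation estimate, and by direct Gaussian maximum likelihood. The helper names, parameter values, and maximum lag are illustrative choices, not the paper's setup.

```python
# Minimal sketch: variogram/ACF regression vs. maximum likelihood for an assumed
# exponential autocorrelation rho(lag) = exp(-lag / tau).
import numpy as np
from scipy.optimize import curve_fit, minimize_scalar

rng = np.random.default_rng(0)
n, tau_true, sigma2 = 200, 5.0, 1.0                     # sample size, true timescale, variance (assumed)
lags = np.arange(n)
cov_true = sigma2 * np.exp(-np.abs(lags[:, None] - lags[None, :]) / tau_true)
x = rng.multivariate_normal(np.zeros(n), cov_true)      # simulated Gaussian series

def empirical_acf(x, max_lag):
    """Non-parametric (plug-in) autocorrelation estimate."""
    x = x - x.mean()
    c0 = np.dot(x, x) / len(x)
    return np.array([np.dot(x[:len(x) - k], x[k:]) / len(x) / c0 for k in range(max_lag + 1)])

# Method 1: regress the model ACF onto the empirical ACF.
max_lag = 30
acf_hat = empirical_acf(x, max_lag)
model_acf = lambda k, tau: np.exp(-k / tau)
tau_reg, _ = curve_fit(model_acf, np.arange(max_lag + 1), acf_hat, p0=[1.0])

# Method 2: Gaussian maximum likelihood directly on the data (variance profiled out).
def neg_loglik(tau):
    C = np.exp(-np.abs(lags[:, None] - lags[None, :]) / tau)
    _, logdet = np.linalg.slogdet(C)
    sigma2_hat = (x @ np.linalg.solve(C, x)) / n
    return 0.5 * (logdet + n * np.log(sigma2_hat))

tau_mle = minimize_scalar(neg_loglik, bounds=(0.1, 50.0), method="bounded").x

print(f"true tau = {tau_true}, regression estimate = {tau_reg[0]:.2f}, MLE = {tau_mle:.2f}")
```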


Read also

Physical or geographic location proves to be an important feature in many data science models, because many diverse natural and social phenomena have a spatial component. Spatial autocorrelation measures the extent to which locally adjacent observations of the same phenomenon are correlated. Although statistics like Moran's $I$ and Geary's $C$ are widely used to measure spatial autocorrelation, they are slow: all popular methods run in $\Omega(n^2)$ time, rendering them unusable for large data sets, or for long time-courses with moderate numbers of points. We propose a new $S_A$ statistic based on the notion that the variance observed when merging pairs of nearby clusters should increase slowly for spatially autocorrelated variables. We give a linear-time algorithm to calculate $S_A$ for a variable with an input agglomeration order (available at https://github.com/aamgalan/spatial_autocorrelation). For a typical dataset of $n \approx 63{,}000$ points, our $S_A$ autocorrelation measure can be computed in 1 second, versus 2 hours or more for Moran's $I$ and Geary's $C$. Through simulation studies, we demonstrate that $S_A$ identifies spatial correlations in variables generated with a spatially dependent model half an order of magnitude earlier than either Moran's $I$ or Geary's $C$. Finally, we prove several theoretical properties of $S_A$: namely, that it behaves as a true correlation statistic, and that it is invariant under addition or multiplication by a constant.
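For context on the quadratic-time baselines this abstract refers to, the sketch below computes Moran's $I$ naively over all $n^2$ point pairs, using an illustrative inverse-distance weight scheme. The paper's $S_A$ statistic itself (linear-time given an agglomeration order) is available at the linked repository and is not reproduced here.

```python
# Naive O(n^2) Moran's I with inverse-distance spatial weights (illustrative weight choice).
import numpy as np

def morans_i(coords, values):
    n = len(values)
    z = values - values.mean()
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    w = np.where(d > 0, 1.0 / np.where(d > 0, d, 1.0), 0.0)   # zero weight on the diagonal
    num = (w * np.outer(z, z)).sum()                          # sum_ij w_ij * z_i * z_j
    return (n / w.sum()) * num / (z @ z)

rng = np.random.default_rng(1)
coords = rng.uniform(size=(500, 2))
smooth = np.sin(4 * coords[:, 0]) + np.cos(4 * coords[:, 1])  # spatially autocorrelated field
noise = rng.normal(size=500)                                   # spatially unstructured field
print("Moran's I, smooth field:", round(morans_i(coords, smooth), 3))
print("Moran's I, white noise: ", round(morans_i(coords, noise), 3))
```
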
A robust estimator is proposed for the parameters that characterize the linear regression problem. It is based on the notion of shrinkage, often used in finance and previously studied for outlier detection in multivariate data. A thorough simulation study is conducted to investigate the efficiency with normal and heavy-tailed errors, the robustness under contamination, the computational times, the affine equivariance, and the breakdown value of the regression estimator. Two classical data sets often used in the literature and a real socio-economic data set about the Living Environment Deprivation of areas in Liverpool (UK) are studied. The results from the simulations and the real data examples show the advantages of the proposed robust estimator in regression.
Bootstrap smoothed (bagged) estimators have been proposed as an improvement on estimators found after preliminary data-based model selection. Efron (2014) derived a widely applicable formula for a delta method approximation to the standard deviation of the bootstrap smoothed estimator. He also considered a confidence interval centered on the bootstrap smoothed estimator, with width proportional to the estimate of this standard deviation. Kabaila and Wijethunga (2019) assessed the performance of this confidence interval in the scenario of two nested linear regression models, the full model and a simpler model, for the case of known error variance and preliminary model selection using a hypothesis test. They found that the performance of this confidence interval was not substantially better than that of the usual confidence interval based on the full model, with the same minimum coverage. We extend this assessment to the case of unknown error variance by deriving a computationally convenient exact formula for the ideal (i.e., in the limit as the number of bootstrap replications diverges to infinity) delta method approximation to the standard deviation of the bootstrap smoothed estimator. Our results show that, unlike in the known error variance case, there are circumstances in which this confidence interval has attractive properties.
Bootstrap smoothed (bagged) parameter estimators have been proposed as an improvement on estimators found after preliminary data-based model selection. The key result of Efron (2014) is a very convenient and widely applicable formula for a delta method approximation to the standard deviation of the bootstrap smoothed estimator. This approximation provides an easily computed guide to the accuracy of this estimator. In addition, Efron (2014) proposed a confidence interval centered on the bootstrap smoothed estimator, with width proportional to the estimate of this approximation to the standard deviation. We evaluate this confidence interval in the scenario of two nested linear regression models, the full model and a simpler model, with a preliminary test of the null hypothesis that the simpler model is correct. We derive computationally convenient expressions for the ideal bootstrap smoothed estimator and for the coverage probability and expected length of this confidence interval. In terms of coverage probability, this confidence interval outperforms the post-model-selection confidence interval with the same nominal coverage and based on the same preliminary test. We also compare the performance of the confidence interval centered on the bootstrap smoothed estimator, in terms of expected length, to the usual confidence interval with the same minimum coverage probability, based on the full model.
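The two abstracts above both build on Efron's bootstrap smoothed (bagged) estimator after preliminary model selection. The following sketch shows the basic construction in an assumed two-nested-linear-models setting: a t-type preliminary test chooses between the full and simpler model, and the bagged estimate averages the post-selection estimate over bootstrap resamples. The selection threshold, sample size, and coefficient values are illustrative, and the reported bootstrap standard deviation is the naive resampling one, not Efron's delta method approximation.

```python
# Bootstrap smoothing of a post-model-selection estimate (illustrative two-model setting).
import numpy as np

rng = np.random.default_rng(2)
n, B = 50, 2000
x1, x2 = rng.normal(size=n), rng.normal(size=n)
y = 1.0 + 0.5 * x1 + 0.3 * x2 + rng.normal(size=n)      # assumed true model

def post_selection_estimate(y, x1, x2, cut=2.0):
    """Estimate of the x1 coefficient after a preliminary t-type test on x2."""
    X_full = np.column_stack([np.ones_like(x1), x1, x2])
    beta, res, *_ = np.linalg.lstsq(X_full, y, rcond=None)
    sigma2 = res[0] / (len(y) - 3)
    se_b2 = np.sqrt(sigma2 * np.linalg.inv(X_full.T @ X_full)[2, 2])
    if abs(beta[2] / se_b2) > cut:                        # keep the full model
        return beta[1]
    X_small = np.column_stack([np.ones_like(x1), x1])     # otherwise drop x2
    return np.linalg.lstsq(X_small, y, rcond=None)[0][1]

theta_hat = post_selection_estimate(y, x1, x2)

# Bagging: average the post-selection estimate over bootstrap resamples of the data.
boot = np.empty(B)
for b in range(B):
    idx = rng.integers(0, n, size=n)
    boot[b] = post_selection_estimate(y[idx], x1[idx], x2[idx])

print(f"post-selection estimate:     {theta_hat:.3f}")
print(f"bootstrap smoothed estimate: {boot.mean():.3f} (bootstrap sd {boot.std(ddof=1):.3f})")
```
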
The problem of Voodoo correlations is recognized in neuroimaging as the problem of estimating quantities of interest from the same data that were used to select them as interesting. In statistical terminology, the problem of inference following selection from the same data is that of selective inference. Motivated by the unwelcome side-effects of the recommended remedy, splitting the data, a method for constructing confidence intervals based on the correct post-selection distribution of the observations has been suggested recently. We utilize a similar approach in order to provide point estimates that account for a large part of the selection bias. We show via extensive simulations that the proposed estimator has favorable properties; namely, it is likely to reduce estimation bias and the mean squared error compared to the direct estimator, without sacrificing power to detect non-zero correlation, as happens with the data splitting approach. We show that both point estimates and confidence intervals are needed in order to get a full assessment of the uncertainty in the point estimates, as both are integrated into the recently proposed Confidence Calibration Plots. The computation of the estimators is implemented in an accompanying software package.
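To make the selection bias that motivates this work concrete, the small simulation below (an assumed setup, not the paper's) selects the largest of many sample correlations and estimates it either from the same data, which overstates the true value, or via data splitting, which is roughly unbiased but uses only half the observations for estimation.

```python
# Selection bias demonstration: estimating the selected (largest) correlation.
import numpy as np

rng = np.random.default_rng(3)
n, m, rho, reps = 40, 50, 0.3, 2000        # samples, candidate variables, true correlation, replications
selected_same, selected_split = [], []

for _ in range(reps):
    x = rng.normal(size=n)
    # m candidate variables, each with the same true correlation rho with x
    ys = rho * x[:, None] + np.sqrt(1 - rho**2) * rng.normal(size=(n, m))
    r_full = np.array([np.corrcoef(x, ys[:, j])[0, 1] for j in range(m)])
    selected_same.append(r_full[np.argmax(np.abs(r_full))])           # select and estimate on the same data
    h = n // 2                                                         # data splitting: select on half one,
    r_half = np.array([np.corrcoef(x[:h], ys[:h, j])[0, 1] for j in range(m)])
    j_split = np.argmax(np.abs(r_half))                                # estimate on half two
    selected_split.append(np.corrcoef(x[h:], ys[h:, j_split])[0, 1])

print(f"true correlation:           {rho}")
print(f"same-data estimate (mean):  {np.mean(selected_same):.3f}")     # biased upward
print(f"split-data estimate (mean): {np.mean(selected_split):.3f}")    # roughly unbiased, noisier
```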