67 - Salim Bouzebda 2020
The first aim of the present paper is to establish strong approximations of the uniform non-overlapping k-spacings process, extending the results of Aly et al. (1984). Our methods rely on the invariance principle of Mason and van Zwet (1987). The second goal is to generalize the results of Dindar (1997) for the increments of the spacings quantile process to the uniform non-overlapping k-spacings quantile process. We apply the latter result to characterize the limit laws of functionals of the increments of the k-spacings quantile process.
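To make the object under study concrete, here is a minimal numerical sketch (mine, not the paper's) of how uniform non-overlapping k-spacings are formed from a sample: sort the observations, build the n+1 elementary spacings against the endpoints 0 and 1, and sum them over disjoint blocks of k.

```python
import numpy as np

def k_spacings(u, k):
    """Uniform non-overlapping k-spacings: sums of k consecutive elementary
    spacings, taken over disjoint blocks."""
    grid = np.concatenate(([0.0], np.sort(u), [1.0]))  # add endpoints 0 and 1
    s = np.diff(grid)                  # n + 1 elementary spacings
    m = len(s) // k                    # number of complete blocks
    return s[:m * k].reshape(m, k).sum(axis=1)

rng = np.random.default_rng(0)
d = k_spacings(rng.uniform(size=999), k=2)
print(d.sum(), d.mean())   # spacings tile [0, 1]; mean = k/(n+1) = 0.002 here
```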
In this paper we consider the nonparametric functional estimation of the drift of Gaussian processes using Paley-Wiener and Karhunen-Loève expansions. We construct efficient estimators for the drift of such processes, and prove their minimaxity using Bayes estimators. We also construct superefficient estimators of Stein type for such drifts using the Malliavin integration by parts formula and stochastic analysis on Gaussian space, in which superharmonic functionals of the process paths play a particular role. Our results are illustrated by numerical simulations and extend the construction of James-Stein type estimators for Gaussian processes by Berger and Wolpert.
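For orientation, the classical finite-dimensional James-Stein estimator that such constructions generalize can be sketched as follows; this is the textbook estimator, not the paper's Gaussian-process construction.

```python
import numpy as np

def james_stein(x, sigma2=1.0):
    """Classical James-Stein estimator of a d-dimensional normal mean:
    shrinks the observation toward the origin by a data-driven factor."""
    d = x.size
    assert d >= 3, "James-Stein dominates the MLE only for d >= 3"
    shrink = 1.0 - (d - 2) * sigma2 / np.dot(x, x)
    return shrink * x

rng = np.random.default_rng(1)
theta = np.zeros(10)                     # true mean
x = theta + rng.standard_normal(10)      # one observation of N(theta, I)
print(np.sum((james_stein(x) - theta)**2), np.sum((x - theta)**2))
```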
In this note we provide explicit expressions and expansions for a special function which appears in nonparametric estimation of log-densities. This function returns the integral of a log-linear function on a simplex of arbitrary dimension. In particular, it is used in the R package LogConcDEAD by Cule et al. (2007).
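The integral in question admits a classical closed form via the Hermite-Genocchi identity: the integral of $\exp(w \cdot y)$ over the standard simplex equals the divided difference of $\exp$ at the nodes $y_0, \dots, y_d$. A minimal sketch of this identity (assuming distinct nodes, where the simple Lagrange form below is valid; the sketch is mine, not the paper's implementation):

```python
import numpy as np
from math import factorial

def simplex_exp_integral(y):
    """Integral of exp(w . y) over the standard d-simplex (t-coordinates),
    computed as the divided difference of exp at the nodes y_0..y_d
    (Hermite-Genocchi). Lagrange form: valid for distinct nodes only."""
    y = np.asarray(y, dtype=float)
    return sum(np.exp(yi) / np.prod([yi - yj for j, yj in enumerate(y) if j != i])
               for i, yi in enumerate(y))

# Monte Carlo check: uniform barycentric weights are Dirichlet(1,...,1);
# the simplex has volume 1/d!, hence the division by d!.
y = np.array([0.3, -1.2, 2.0, 0.7])
d = len(y) - 1
rng = np.random.default_rng(2)
w = rng.dirichlet(np.ones(len(y)), size=200_000)
print(simplex_exp_integral(y), np.exp(w @ y).mean() / factorial(d))
```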
99 - Alain Celisse 2014
We analyze the performance of cross-validation (CV) in the density estimation framework with two purposes: (i) risk estimation and (ii) model selection. The main focus is on the so-called leave-$p$-out CV procedure (Lpo), where $p$ denotes the cardinality of the test set. Closed-form expressions are derived for the Lpo estimator of the risk of projection estimators. These expressions provide a great improvement upon $V$-fold cross-validation in terms of variability and computational complexity. From a theoretical point of view, the closed-form expressions also make it possible to study the Lpo performance in terms of risk estimation. Leave-one-out (Loo), that is, Lpo with $p=1$, is proved to be optimal among CV procedures used for risk estimation. Two model selection frameworks are also considered: estimation, as opposed to identification. For estimation with finite sample size $n$, optimality is achieved for $p$ large enough [with $p/n = o(1)$] to balance the overfitting resulting from the structure of the model collection. For identification, model selection consistency is established for Lpo as long as $p/n$ is conveniently related to the rate of convergence of the best estimator in the collection: (i) $p/n \to 1$ as $n \to +\infty$ with a parametric rate, and (ii) $p/n = o(1)$ with some nonparametric estimators. These theoretical results are validated by simulation experiments.
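As a concrete instance of such closed forms, the leave-one-out risk of the simplest projection estimator, the regular histogram, is available without refitting. The Rudemo-type formula below is the standard textbook case, not the paper's general expression for arbitrary projection estimators or general $p$.

```python
import numpy as np

def loo_risk_histogram(x, m):
    """Closed-form leave-one-out least-squares CV risk for a regular
    histogram with m bins on [0, 1] (Rudemo-type formula)."""
    n = len(x)
    h = 1.0 / m
    counts, _ = np.histogram(x, bins=m, range=(0.0, 1.0))
    p_hat = counts / n
    return 2.0 / ((n - 1) * h) - (n + 1) / ((n - 1) * h) * np.sum(p_hat**2)

rng = np.random.default_rng(3)
x = rng.beta(2, 5, size=500)                 # some density supported on [0, 1]
risks = {m: loo_risk_histogram(x, m) for m in (5, 10, 20, 40, 80)}
print(min(risks, key=risks.get), risks)      # pick the bin count minimizing risk
```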
We consider the inference problem for parameters in stochastic differential equation models from discrete time observations (e.g. experimental or simulation data). Specifically, we study the case where one does not have access to observations of the model itself, but only to a perturbed version which converges weakly to the solution of the model. Motivated by this perturbation argument, we study the convergence of estimation procedures from a numerical analysis point of view. More precisely, we introduce appropriate consistency, stability, and convergence concepts and study their connection. It turns out that standard statistical techniques, such as the maximum likelihood estimator, are not convergent methodologies in this setting, since they fail to be stable. Due to this shortcoming, we introduce and analyse a novel inference procedure for parameters in stochastic differential equation models which turns out to be convergent. As such, the method is particularly suited for the estimation of parameters in effective (i.e. coarse-grained) models from observations of the corresponding multiscale process. We illustrate these theoretical findings via several numerical examples.
In many applications it is desirable to infer coarse-grained models from observational data. The observed process often corresponds only to a few selected degrees of freedom of a high-dimensional dynamical system with multiple time scales. In this work we consider the inference problem of identifying an appropriate coarse-grained model from a single time series of a multiscale system. It is known that estimators such as the maximum likelihood estimator or the quadratic variation of the path estimator can be strongly biased in this setting. Here we present a novel parametric inference methodology for problems with linear parameter dependency that does not suffer from this drawback. Furthermore, we demonstrate through a wide spectrum of examples that our methodology can be used to derive appropriate coarse-grained models from time series of partial observations of a multiscale system in an effective and systematic fashion.
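A toy illustration of the bias phenomenon (not the authors' estimator): estimating the diffusion coefficient of a slow Ornstein-Uhlenbeck process from observations corrupted by a fast OU component. Quadratic variation at the finest sampling scale picks up the fast dynamics, while subsampling well above the fast time scale mitigates the bias.

```python
import numpy as np

rng = np.random.default_rng(4)

# Slow diffusion dX = -X dt + dW, observed through a fast OU perturbation:
# a toy stand-in for "partial observations of a multiscale system".
T, dt = 100.0, 1e-3
tau, v = 1e-2, 1e-2          # fast time scale and stationary variance of the noise
n = int(T / dt)
x = np.zeros(n); z = np.zeros(n)
for i in range(1, n):
    x[i] = x[i-1] - x[i-1] * dt + np.sqrt(dt) * rng.standard_normal()
    z[i] = z[i-1] - z[i-1] * dt / tau + np.sqrt(2 * v * dt / tau) * rng.standard_normal()
y = x + z

def qv_estimate(path, step):
    """Quadratic-variation estimate of the diffusion coefficient from
    increments taken every `step` grid points."""
    inc = np.diff(path[::step])
    return np.sum(inc**2) / T

print(qv_estimate(y, 1))     # finest scale: strongly biased by the fast component
print(qv_estimate(y, 100))   # subsampled at ~10*tau: much closer to the true value 1
```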
While there is considerable work on change point analysis in univariate time series, more and more of the data being collected comes from high-dimensional multivariate settings. This paper introduces the asymptotic concept of high-dimensional efficiency, which quantifies the detection power of different statistics in such situations. While related to classic asymptotic relative efficiency, it differs in that it provides the rate at which the change can shrink with dimension while still being detectable. This also allows for comparisons of methods with different null asymptotics, as is for example the case in high-dimensional change point settings. Based on this new concept we investigate change point detection procedures using projections and develop asymptotic theory for how full panel (multivariate) tests compare with both oracle and random projections. Furthermore, for each given projection we can quantify a cone such that the corresponding projection statistic yields better power behavior if the true change direction lies within this cone. The effect of misspecification of the covariance on the power of the tests is investigated, because in many high-dimensional situations estimating the full dependency (covariance) between the multivariate observations in the panel is often either computationally or even theoretically infeasible. It turns out that the projection statistic is much more robust in this respect in terms of size and somewhat more robust in terms of power. The theoretical quantification is accompanied by simulation results which confirm the asymptotic findings for surprisingly small samples. This shows in particular that the concept of high-dimensional efficiency is indeed suitable to describe small-sample power, as demonstrated in a multivariate example of market index data.
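A generic sketch of the projection idea in the simplest mean-change setting (a hypothetical toy statistic, not the paper's exact procedure): project the panel onto a direction and apply a univariate CUSUM; an oracle projection along the true change direction typically dominates a random one.

```python
import numpy as np

def cusum_stat(series):
    """Max-type CUSUM statistic for a mean change in a univariate series."""
    n = len(series)
    s = np.cumsum(series - series.mean())
    return np.max(np.abs(s)) / (np.std(series) * np.sqrt(n))

rng = np.random.default_rng(5)
n, d = 200, 50
delta = np.zeros(d); delta[:5] = 0.4               # sparse change direction
x = rng.standard_normal((n, d))
x[n // 2:] += delta                                # mean shift after the change point

proj = delta / np.linalg.norm(delta)               # oracle projection (true direction)
rand = rng.standard_normal(d); rand /= np.linalg.norm(rand)
print(cusum_stat(x @ proj), cusum_stat(x @ rand))  # oracle >> random, typically
```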
171 - Xinjia Chen 2014
We explore the applications of our previously established likelihood-ratio method for deriving concentration inequalities for a wide variety of univariate and multivariate distributions. New concentration inequalities for various distributions are developed without resorting to the minimization of moment generating functions.
We develop singular value shrinkage priors for the mean matrix parameters in the matrix-variate normal model with known covariance matrices. Our priors are superharmonic and put more weight on matrices with smaller singular values. They are a natural generalization of the Stein prior. Bayes estimators and Bayesian predictive densities based on our priors are minimax and dominate those based on the uniform prior in finite samples. In particular, our priors work well when the true value of the parameter has low rank.
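For illustration, a singular-value shrinkage rule in the spirit of the Efron-Morris estimator (a positive-part variant; a sketch of the idea, not necessarily the paper's exact Bayes estimator): each singular value is pulled toward zero, with the smallest values shrunk proportionally the most, which is what favors low-rank matrices.

```python
import numpy as np

def efron_morris(x):
    """Efron-Morris-type estimator of an n x p matrix mean: shrinks each
    singular value sigma_i of the observation toward zero by (n - p - 1)/sigma_i,
    so directions with small singular values are shrunk the most."""
    n, p = x.shape
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    s_shrunk = np.maximum(s - (n - p - 1) / s, 0.0)   # positive-part variant
    return (u * s_shrunk) @ vt

rng = np.random.default_rng(6)
m = np.outer(rng.standard_normal(30), rng.standard_normal(4))   # rank-1 truth
x = m + rng.standard_normal((30, 4))                            # N(m, I) noise
print(np.linalg.norm(efron_morris(x) - m), np.linalg.norm(x - m))
```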
We present a Bayesian reconstruction algorithm to generate unbiased samples of the underlying dark matter field from halo catalogues. Our new contribution consists of implementing a non-Poisson likelihood including a deterministic non-linear and scale-dependent bias. In particular we present the Hamiltonian equations of motion for the negative binomial (NB) probability distribution function. This permits us to efficiently sample the posterior distribution of density fields given a sample of galaxies using the Hamiltonian Monte Carlo technique implemented in the Argo code. We have tested our algorithm with the Bolshoi $N$-body simulation at redshift $z = 0$, inferring the underlying dark matter density field from sub-samples of the halo catalogue with biases smaller and larger than one. Our method shows that we can draw nearly unbiased samples (compatible within 1-$\sigma$) from the posterior distribution up to scales of about $k \sim 1\,h/$Mpc in terms of power spectra and cell-to-cell correlations. We find that a Poisson likelihood yields reconstructions with power spectra deviating by more than 10% at $k = 0.2\,h/$Mpc. Our reconstruction algorithm is especially suited for emission-line galaxy data, for which a complex non-linear stochastic biasing treatment beyond Poissonity becomes indispensable.
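The key ingredient of such a sampler is the gradient of the NB log-likelihood with respect to the density field, which enters the Hamiltonian equations of motion as a force term. A minimal sketch, assuming a simple power-law bias $\lambda = \bar{n}(1+\delta)^b$ (the paper's bias model is more elaborate) and dropping terms independent of the field:

```python
import numpy as np

def nb_loglike_and_grad(delta, counts, nbar=2.0, b=1.5, beta=4.0):
    """Negative-binomial log-likelihood of halo counts given a density field
    (up to field-independent constants), and its gradient w.r.t. the field --
    the force term an HMC sampler needs. Assumes the simple power-law bias
    lam = nbar * (1 + delta)**b; a stand-in for the paper's bias model."""
    lam = nbar * (1.0 + delta) ** b                    # expected counts per cell
    ll = np.sum(counts * np.log(lam) - (counts + beta) * np.log(beta + lam))
    dll_dlam = counts / lam - (counts + beta) / (beta + lam)
    dlam_ddelta = nbar * b * (1.0 + delta) ** (b - 1.0)
    return ll, dll_dlam * dlam_ddelta

rng = np.random.default_rng(7)
delta = rng.lognormal(sigma=0.5, size=64**2) - 1.0      # toy overdensity field
lam = 2.0 * (1.0 + delta) ** 1.5
counts = rng.negative_binomial(4.0, 4.0 / (4.0 + lam))  # NB with mean lam, shape 4
ll, grad = nb_loglike_and_grad(delta, counts)
print(ll, grad.shape)
```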