No Arabic abstract
A large class of problems in sciences and engineering can be formulated as the general problem of constructing random intervals with pre-specified coverage probabilities for the mean. Wee propose a general approach for statistical inference of mean values based on accumulated observational data. We show that the construction of such random intervals can be accomplished by comparing the endpoints of random intervals with confidence sequences for the mean. Asymptotic results are obtained for such sequential methods.
Assuming that data are collected sequentially from independent streams, we consider the simultaneous testing of multiple binary hypotheses under two general setups; when the number of signals (correct alternatives) is known in advance, and when we only have a lower and an upper bound for it. In each of these setups, we propose feasible procedures that control, without any distributional assumptions, the familywise error probabilities of both type I and type II below given, user-specified levels. Then, in the case of i.i.d. observations in each stream, we show that the proposed procedures achieve the optimal expected sample size, under every possible signal configuration, asymptotically as the two error probabilities vanish at arbitrary rates. A simulation study is presented in a completely symmetric case and supports insights obtained from our asymptotic results, such as the fact that knowledge of the exact number of signals roughly halves the expected number of observations compared to the case of no prior information.
In this paper, we develop a computational approach for estimating the mean value of a quantity in the presence of uncertainty. We demonstrate that, under some mild assumptions, the upper and lower bounds of the mean value are efficiently computable via a sample reuse technique, of which the computational complexity is shown to posses a Poisson distribution.
We study the least squares estimator in the residual variance estimation context. We show that the mean squared differences of paired observations are asymptotically normally distributed. We further establish that, by regressing the mean squared differences of these paired observations on the squared distances between paired covariates via a simple least squares procedure, the resulting variance estimator is not only asymptotically normal and root-$n$ consistent, but also reaches the optimal bound in terms of estimation variance. We also demonstrate the advantage of the least squares estimator in comparison with existing methods in terms of the second order asymptotic properties.
We revisit the problem of estimating the mean of a real-valued distribution, presenting a novel estimator with sub-Gaussian convergence: intuitively, our estimator, on any distribution, is as accurate as the sample mean is for the Gaussian distribution of matching variance. Crucially, in contrast to prior works, our estimator does not require prior knowledge of the variance, and works across the entire gamut of distributions with bounded variance, including those without any higher moments. Parameterized by the sample size $n$, the failure probability $delta$, and the variance $sigma^2$, our estimator is accurate to within $sigmacdot(1+o(1))sqrt{frac{2logfrac{1}{delta}}{n}}$, tight up to the $1+o(1)$ factor. Our estimator construction and analysis gives a framework generalizable to other problems, tightly analyzing a sum of dependent random variables by viewing the sum implicitly as a 2-parameter $psi$-estimator, and constructing bounds using mathematical programming and duality techniques.
In this paper, we have established a unified framework of multistage parameter estimation. We demonstrate that a wide variety of statistical problems such as fixed-sample-size interval estimation, point estimation with error control, bounded-width confidence intervals, interval estimation following hypothesis testing, construction of confidence sequences, can be cast into the general framework of constructing sequential random intervals with prescribed coverage probabilities. We have developed exact methods for the construction of such sequential random intervals in the context of multistage sampling. In particular, we have established inclusion principle and coverage tuning techniques to control and adjust the coverage probabilities of sequential random intervals. We have obtained concrete sampling schemes which are unprecedentedly efficient in terms of sampling effort as compared to existing procedures.