In this paper, we establish a unified framework of multistage parameter estimation. We demonstrate that a wide variety of statistical problems, such as fixed-sample-size interval estimation, point estimation with error control, bounded-width confidence intervals, interval estimation following hypothesis testing, and the construction of confidence sequences, can be cast into the general framework of constructing sequential random intervals with prescribed coverage probabilities. We develop exact methods for the construction of such sequential random intervals in the context of multistage sampling. In particular, we establish the inclusion principle and coverage tuning techniques to control and adjust the coverage probabilities of sequential random intervals. We obtain concrete sampling schemes that are unprecedentedly efficient in terms of sampling effort as compared to existing procedures.
In this paper, we have established a general framework of multistage hypothesis tests which applies to arbitrarily many mutually exclusive and exhaustive composite hypotheses. Within the new framework, we have constructed specific multistage tests which rigorously control the risk of committing decision errors and are more efficient than previous tests in terms of average sample number and the number of sampling operations. Without truncation, the sample numbers of our testing plans are absolutely bounded.
We first review existing sequential methods for estimating a binomial proportion. Afterward, we propose a new family of group sequential sampling schemes for estimating a binomial proportion with a prescribed margin of error and confidence level. In particular, we establish the uniform controllability of the coverage probability and the asymptotic optimality of this family of sampling schemes. Our theoretical results show that the parameters of this family of sampling schemes can be chosen so that the prescribed level of confidence is guaranteed with little waste of samples. Analytic bounds for the cumulative distribution functions and expectations of sample numbers are derived. Moreover, we discuss the inherent connections among various sampling schemes. Numerical issues are addressed to improve the accuracy and efficiency of computation. Computational experiments are conducted to compare sampling schemes. Illustrative examples are given for applications in clinical trials.
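For orientation, the "prescribed margin of error and confidence level" requirement can be illustrated with a fixed-sample-size baseline. The sketch below is not one of the group sequential schemes proposed in the abstract. It assumes the standard Hoeffding bound $P(|\hat{p} - p| \ge \epsilon) \le 2\exp(-2n\epsilon^2)$, and the function name `hoeffding_sample_size` is hypothetical. A sequential scheme would aim to use fewer samples on average than this conservative count.

```python
import math


def hoeffding_sample_size(margin: float, confidence: float) -> int:
    """Fixed sample size n such that 2*exp(-2*n*margin**2) <= 1 - confidence.

    Conservative baseline from Hoeffding's inequality (an assumption of this
    sketch, not a scheme taken from the abstract).
    """
    if not (0.0 < margin < 1.0):
        raise ValueError("margin must lie in (0, 1)")
    if not (0.0 < confidence < 1.0):
        raise ValueError("confidence must lie in (0, 1)")
    delta = 1.0 - confidence  # allowed probability of missing the margin
    return math.ceil(math.log(2.0 / delta) / (2.0 * margin ** 2))


if __name__ == "__main__":
    # Absolute margin 0.05 at 95% confidence.
    print(hoeffding_sample_size(0.05, 0.95))
```

Halving the margin roughly quadruples the required sample size under this bound, which is the kind of cost that motivates multistage designs.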
In this paper, we develop a multistage approach for estimating the mean of a bounded variable. We first focus on the multistage estimation of a binomial parameter and then generalize the estimation methods to the case of general bounded random variables. A fundamental connection between a binomial parameter and the mean of a bounded variable is established. Our multistage estimation methods rigorously guarantee prescribed levels of precision and confidence.
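One well-known way to relate the mean of a bounded variable to a binomial parameter is randomized rounding. The sketch below illustrates that general idea and is not necessarily the connection established in the paper. If $X \in [a, b]$ and $U$ is uniform on $[0, 1)$ and independent of $X$, then the indicator of $U < (X - a)/(b - a)$ is a Bernoulli variable with success probability $(E[X] - a)/(b - a)$. The function names are hypothetical.

```python
import random


def bernoulli_from_bounded(x: float, lower: float, upper: float,
                           rng: random.Random) -> int:
    """Return 1 with probability (x - lower) / (upper - lower), else 0."""
    scaled = (x - lower) / (upper - lower)
    return 1 if rng.random() < scaled else 0


def estimate_mean_via_bernoulli(samples, lower: float, upper: float,
                                seed: int = 0) -> float:
    """Estimate E[X] from the success frequency of the derived Bernoulli trials."""
    if upper <= lower:
        raise ValueError("upper must exceed lower")
    if not samples:
        raise ValueError("samples must be non-empty")
    rng = random.Random(seed)
    hits = sum(bernoulli_from_bounded(x, lower, upper, rng) for x in samples)
    # Map the estimated binomial parameter back to the original scale.
    return lower + (upper - lower) * hits / len(samples)


if __name__ == "__main__":
    data_rng = random.Random(1)
    data = [data_rng.uniform(2.0, 6.0) for _ in range(10000)]
    print(estimate_mean_via_bernoulli(data, lower=2.0, upper=6.0))
```

The rounding step discards information, so this estimator has a larger variance than the plain sample mean. Its value here is only to show how a bounded-mean problem can be recast as a binomial one.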
In this article, we derive a new generalization of the Chebyshev inequality for random vectors. We demonstrate that the new generalization is much less conservative than the classical generalization.
The spectral gap $\gamma$ of a finite, ergodic, and reversible Markov chain is an important parameter measuring the asymptotic rate of convergence. In applications, the transition matrix $P$ may be unknown, yet one sample of the chain up to a fixed time $n$ may be observed. We consider here the problem of estimating $\gamma$ from this data. Let $\pi$ be the stationary distribution of $P$, and $\pi_\star = \min_x \pi(x)$. We show that if $n = \tilde{O}\bigl(\frac{1}{\gamma \pi_\star}\bigr)$, then $\gamma$ can be estimated to within multiplicative constants with high probability. When $\pi$ is uniform on $d$ states, this matches (up to logarithmic correction) a lower bound of $\tilde{\Omega}\bigl(\frac{d}{\gamma}\bigr)$ steps required for precise estimation of $\gamma$. Moreover, we provide the first procedure for computing a fully data-dependent interval, from a single finite-length trajectory of the chain, that traps the mixing time $t_{\text{mix}}$ of the chain at a prescribed confidence level. The interval does not require the knowledge of any parameters of the chain. This stands in contrast to previous approaches, which either only provide point estimates, or require a reset mechanism, or additional prior knowledge. The interval is constructed around the relaxation time $t_{\text{relax}} = 1/\gamma$, which is strongly related to the mixing time, and the width of the interval converges to zero roughly at a $1/\sqrt{n}$ rate, where $n$ is the length of the sample path.
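To make the estimation problem concrete, here is a minimal plug-in sketch that produces a point estimate of the spectral gap from a single trajectory. It is an illustration under stated assumptions, not the procedure or the confidence interval from the paper. It assumes states are labelled `0..num_states-1`, it symmetrizes the empirical pair counts to enforce reversibility, and it reports the absolute spectral gap $1 - \max\{\lambda_2, |\lambda_d|\}$. The function name `estimate_spectral_gap` and the `smoothing` parameter are hypothetical.

```python
import numpy as np


def estimate_spectral_gap(path, num_states: int, smoothing: float = 0.0) -> float:
    """Plug-in estimate of the absolute spectral gap from one sample path."""
    if num_states < 2:
        raise ValueError("need at least two states")
    path = np.asarray(path, dtype=int)
    # Count observed transitions x -> y (optionally smoothed).
    counts = np.full((num_states, num_states), smoothing, dtype=float)
    np.add.at(counts, (path[:-1], path[1:]), 1.0)
    # Symmetrize pair frequencies so the fitted chain is reversible.
    joint = (counts + counts.T) / 2.0
    joint /= joint.sum()
    pi_hat = joint.sum(axis=1)  # estimated stationary distribution
    if np.any(pi_hat <= 0.0):
        raise ValueError("every state must be visited, or use smoothing > 0")
    # D^{-1/2} Q D^{-1/2} is symmetric and shares eigenvalues with the
    # fitted transition matrix D^{-1} Q.
    scale = 1.0 / np.sqrt(pi_hat)
    sym = scale[:, None] * joint * scale[None, :]
    eigenvalues = np.linalg.eigvalsh(sym)  # ascending order
    second_largest = eigenvalues[-2]
    smallest = eigenvalues[0]
    return float(1.0 - max(second_largest, abs(smallest)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    P = np.array([[0.9, 0.1], [0.2, 0.8]])  # true gap is 1 - 0.7 = 0.3
    state, path = 0, [0]
    for _ in range(20000):
        state = 1 if rng.random() < P[state, 1] else 0
        path.append(state)
    gap = estimate_spectral_gap(path, num_states=2)
    print(gap, 1.0 / gap)  # estimated gap and relaxation time
```

A point estimate like this carries no confidence guarantee by itself, which is the limitation of earlier approaches that the abstract contrasts with its data-dependent interval.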