Do you want to publish a course? Click here

Geometry of Log-Concave Density Estimation

112   0   0.0 ( 0 )
 Added by Bernd Sturmfels
 Publication date 2017
and research's language is English




Ask ChatGPT about the research

Shape-constrained density estimation is an important topic in mathematical statistics. We focus on densities on $mathbb{R}^d$ that are log-concave, and we study geometric properties of the maximum likelihood estimator (MLE) for weighted samples. Cule, Samworth, and Stewart showed that the logarithm of the optimal log-concave density is piecewise linear and supported on a regular subdivision of the samples. This defines a map from the space of weights to the set of regular subdivisions of the samples, i.e. the face poset of their secondary polytope. We prove that this map is surjective. In fact, every regular subdivision arises in the MLE for some set of weights with positive probability, but coarser subdivisions appear to be more likely to arise than finer ones. To quantify these results, we introduce a continuous version of the secondary polytope, whose dual we name the Samworth body. This article establishes a new link between geometric combinatorics and nonparametric statistics, and it suggests numerous open problems.



rate research

Read More

Let X_1, ..., X_n be independent and identically distributed random vectors with a log-concave (Lebesgue) density f. We first prove that, with probability one, there exists a unique maximum likelihood estimator of f. The use of this estimator is attractive because, unlike kernel density estimation, the method is fully automatic, with no smoothing parameters to choose. Although the existence proof is non-constructive, we are able to reformulate the issue of computation in terms of a non-differentiable convex optimisation problem, and thus combine techniques of computational geometry with Shors r-algorithm to produce a sequence that converges to the maximum likelihood estimate. For the moderate or large sample sizes in our simulations, the maximum likelihood estimator is shown to provide an improvement in performance compared with kernel-based methods, even when we allow the use of a theoretical, optimal fixed bandwidth for the kernel estimator that would not be available in practice. We also present a real data clustering example, which shows that our methodology can be used in conjunction with the Expectation--Maximisation (EM) algorithm to fit finite mixtures of log-concave densities. An R version of the algorithm is available in the package LogConcDEAD -- Log-Concave Density Estimation in Arbitrary Dimensions.
In Statistics, log-concave density estimation is a central problem within the field of nonparametric inference under shape constraints. Despite great progress in recent years on the statistical theory of the canonical estimator, namely the log-concave maximum likelihood estimator, adoption of this method has been hampered by the complexities of the non-smooth convex optimization problem that underpins its computation. We provide enhanced understanding of the structural properties of this optimization problem, which motivates the proposal of new algorithms, based on both randomized and Nesterov smoothing, combined with an appropriate integral discretization of increasing accuracy. We prove that these methods enjoy, both with high probability and in expectation, a convergence rate of order $1/T$ up to logarithmic factors on the objective function scale, where $T$ denotes the number of iterations. The benefits of our new computational framework are demonstrated on both synthetic and real data, and our implementation is available in a github repository texttt{LogConcComp} (Log-Concave Computation).
Nonparametric statistics for distribution functions F or densities f=F under qualitative shape constraints provides an interesting alternative to classical parametric or entirely nonparametric approaches. We contribute to this area by considering a new shape constraint: F is said to be bi-log-concave, if both log(F) and log(1 - F) are concave. Many commonly considered distributions are compatible with this constraint. For instance, any c.d.f. F with log-concave density f = F is bi-log-concave. But in contrast to the latter constraint, bi-log-concavity allows for multimodal densities. We provide various characterizations. It is shown that combining any nonparametric confidence band for F with the new shape-constraint leads to substantial improvements, particularly in the tails. To pinpoint this, we show that these confidence bands imply non-trivial confidence bounds for arbitrary moments and the moment generating function of F.
We find limiting distributions of the nonparametric maximum likelihood estimator (MLE) of a log-concave density, that is, a density of the form $f_0=expvarphi_0$ where $varphi_0$ is a concave function on $mathbb{R}$. The pointwise limiting distributions depend on the second and third derivatives at 0 of $H_k$, the lower invelope of an integrated Brownian motion process minus a drift term depending on the number of vanishing derivatives of $varphi_0=log f_0$ at the point of interest. We also establish the limiting distribution of the resulting estimator of the mode $M(f_0)$ and establish a new local asymptotic minimax lower bound which shows the optimality of our mode estimator in terms of both rate of convergence and dependence of constants on population values.
Let ${P_{theta}:theta in {mathbb R}^d}$ be a log-concave location family with $P_{theta}(dx)=e^{-V(x-theta)}dx,$ where $V:{mathbb R}^dmapsto {mathbb R}$ is a known convex function and let $X_1,dots, X_n$ be i.i.d. r.v. sampled from distribution $P_{theta}$ with an unknown location parameter $theta.$ The goal is to estimate the value $f(theta)$ of a smooth functional $f:{mathbb R}^dmapsto {mathbb R}$ based on observations $X_1,dots, X_n.$ In the case when $V$ is sufficiently smooth and $f$ is a functional from a ball in a Holder space $C^s,$ we develop estimators of $f(theta)$ with minimax optimal error rates measured by the $L_2({mathbb P}_{theta})$-distance as well as by more general Orlicz norm distances. Moreover, we show that if $dleq n^{alpha}$ and $s>frac{1}{1-alpha},$ then the resulting estimators are asymptotically efficient in Hajek-LeCam sense with the convergence rate $sqrt{n}.$ This generalizes earlier results on estimation of smooth functionals in Gaussian shift models. The estimators have the form $f_k(hat theta),$ where $hat theta$ is the maximum likelihood estimator and $f_k: {mathbb R}^dmapsto {mathbb R}$ (with $k$ depending on $s$) are functionals defined in terms of $f$ and designed to provide a higher order bias reduction in functional estimation problem. The method of bias reduction is based on iterative parametric bootstrap and it has been successfully used before in the case of Gaussian models.
comments
Fetching comments Fetching comments
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا