Multivariate convex regression: global risk bounds and adaptation


Abstract in English

We study the problem of estimating a multivariate convex function defined on a convex body in a regression setting with random design. We are interested in optimal rates of convergence under a squared global continuous $l_2$ loss in the multivariate setting $(d \geq 2)$. One crucial fact is that the minimax risks depend heavily on the shape of the support of the regression function. It is shown that the global minimax risk is of order $n^{-2/(d+1)}$ when the support is sufficiently smooth, but of order $n^{-4/(d+4)}$ when the support is a polytope. This difference in rates is due to the difficulty of estimating the regression function near the boundary of smooth regions. We then study the natural bounded least squares estimator (BLSE): we show that the BLSE nearly attains the optimal rates of convergence in low dimensions, while suffering rate-inefficiency in high dimensions. Using a local entropy method, we show that, in low dimensions, the BLSE adapts nearly parametrically to polyhedral functions when the support is polyhedral. We also show that the boundedness constraint cannot be dropped when risk is assessed via continuous $l_2$ loss. Given the rate sub-optimality of the BLSE in higher dimensions, we further study rate-efficient adaptive estimation procedures. Two general model selection methods are developed to provide sieved adaptive estimators (SAE) that achieve nearly optimal rates of convergence for particular regular classes of convex functions, while maintaining nearly parametric rate-adaptivity to polyhedral functions in arbitrary dimensions. Interestingly, the uniform boundedness constraint is unnecessary when risks are measured in discrete $l_2$ norms.
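
As a quick arithmetic check on the two minimax rates quoted above (a sketch that uses only the exponents stated in the abstract, not a result taken from the paper), the exponents can be compared directly:

$$
\frac{4}{d+4} - \frac{2}{d+1} = \frac{4(d+1) - 2(d+4)}{(d+1)(d+4)} = \frac{2(d-2)}{(d+1)(d+4)} \geq 0 \quad \text{for } d \geq 2,
$$

with equality only at $d = 2$, where both exponents equal $2/3$. For $d \geq 3$ the polytope rate is strictly faster; for example, at $d = 3$ the two rates are $n^{-1/2}$ (smooth support) and $n^{-4/7}$ (polytopal support).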
