ﻻ يوجد ملخص باللغة العربية
Many modern statistical estimation problems are defined by three major components: a statistical model that postulates the dependence of an output variable on the input features; a loss function measuring the error between the observed output and the model predicted output; and a regularizer that controls the overfitting and/or variable selection in the model. We study the sampling version of this generic statistical estimation problem where the model parameters are estimated by empirical risk minimization, which involves the minimization of the empirical average of the loss function at the data points weighted by the model regularizer. In our setup we allow all three component functions discussed above to be of the difference-of-convex (dc) type and illustrate them with a host of commonly used examples, including those in continuous piecewise affine regression and in deep learning (where the activation functions are piecewise affine). We describe a nonmonotone majorization-minimization (MM) algorithm for solving the unified nonconvex, nondifferentiable optimization problem which is formulated as a specially structured composite dc program of the pointwise max type, and present convergence results to a directional stationary solution. An efficient semismooth Newton method is proposed to solve the dual of the MM subproblems. Numerical results are presented to demonstrate the effectiveness of the proposed algorithm and the superiority of continuous piecewise affine regression over the standard linear model.
We show that the linear or quadratic 0/1 program[P:quadmin{ c^Tx+x^TFx : :A,x =b;:xin{0,1}^n},]can be formulated as a MAX-CUT problem whose associated graph is simply related to the matrices $F$ and $A^TA$.Hence the whole arsenal of approximation tec
We study the ridge method for min-max problems, and investigate its convergence without any convexity, differentiability or qualification assumption. The central issue is to determine whether the parametric optimality formula provides a conservative
We introduce a framework, which we denote as the augmented estimate sequence, for deriving fast algorithms with provable convergence guarantees. We use this framework to construct a new first-order scheme, the Accelerated Composite Gradient Method (A
Recent applications in machine learning have renewed the interest of the community in min-max optimization problems. While gradient-based optimization methods are widely used to solve such problems, there are however many scenarios where these techni
This paper introduces to readers the new concept and methodology of confidence distribution and the modern-day distributional inference in statistics. This discussion should be of interest to people who would like to go into the depth of the statisti