We propose $\ell_1$-norm regularized quadratic surface support vector machine models for binary classification in supervised learning. We establish their desirable theoretical properties, including the existence and uniqueness of the optimal solution, reduction to the standard SVMs over (almost) linearly separable data sets, and detection of the true sparsity pattern over (almost) quadratically separable data sets when the penalty parameter of the $\ell_1$ norm is sufficiently large. We also demonstrate their promising practical efficiency by conducting various numerical experiments on both synthetic and publicly available benchmark data sets.
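To make the formulation concrete, here is a minimal sketch of an $\ell_1$-penalized quadratic-surface SVM using cvxpy; the decision surface $f(x) = \tfrac{1}{2}x^\top W x + b^\top x + c$, the hinge loss, the elementwise $\ell_1$ penalty on $W$, and the function name `fit_l1_qssvm` are assumptions standing in for the paper's exact models, and the toy data set is illustrative only.

```python
# Minimal sketch (assumed formulation, not necessarily the authors' exact model):
# decision surface f(x) = 0.5 x'Wx + b'x + c, hinge loss, elementwise l1 penalty on W.
import numpy as np
import cvxpy as cp

def fit_l1_qssvm(X, y, lam=1.0, C=1.0):
    """X: (n, d) features; y: (n,) labels in {-1, +1}; lam weights the l1 penalty on W."""
    d = X.shape[1]
    W = cp.Variable((d, d), symmetric=True)   # quadratic-term matrix of the surface
    b = cp.Variable(d)                        # linear-term vector
    c = cp.Variable()                         # offset
    # f(x_i) = 0.5 * x_i' W x_i + b' x_i + c, vectorized over the rows of X
    scores = 0.5 * cp.sum(cp.multiply(X @ W, X), axis=1) + X @ b + c
    hinge = cp.sum(cp.pos(1 - cp.multiply(y, scores)))   # hinge loss
    l1_penalty = cp.sum(cp.abs(W))                       # elementwise l1 norm of W
    prob = cp.Problem(cp.Minimize(C * hinge + lam * l1_penalty))
    prob.solve()
    return W.value, b.value, c.value

# toy usage: points inside the unit circle are labeled +1, a quadratically separable set
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = np.where((X ** 2).sum(axis=1) < 1.0, 1.0, -1.0)
W, b, c = fit_l1_qssvm(X, y, lam=0.1)
```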
We propose a method for support vector machine classification using indefinite kernels. Instead of directly minimizing or stabilizing a nonconvex loss function, our algorithm simultaneously computes the support vectors and a proxy kernel matrix used in forming the loss. This can be interpreted as a penalized kernel learning problem in which indefinite kernel matrices are treated as noisy observations of a true Mercer kernel. Our formulation keeps the problem convex, and relatively large problems can be solved efficiently using projected gradient or analytic center cutting plane methods. We compare the performance of our technique with that of other methods on several classic data sets.
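One building block in such proxy-kernel approaches is the projection of an indefinite similarity matrix onto the cone of positive semidefinite (Mercer) kernels. The sketch below shows only that Frobenius-nearest PSD projection via eigenvalue clipping, not the paper's joint optimization over support vectors and the proxy kernel; the sigmoid similarity used as a toy example is an assumption.

```python
# Projection of an indefinite similarity matrix onto the PSD cone (Frobenius-nearest
# Mercer kernel). This is only one ingredient of proxy-kernel methods, not the full
# joint optimization described in the abstract.
import numpy as np

def nearest_psd_kernel(K0):
    """Return the PSD matrix closest to the symmetrized K0 in Frobenius norm."""
    K0 = 0.5 * (K0 + K0.T)                  # symmetrize first
    w, V = np.linalg.eigh(K0)               # eigendecomposition
    w = np.clip(w, 0.0, None)               # clip negative eigenvalues to zero
    return (V * w) @ V.T

# toy usage: a sigmoid "kernel" is indefinite in general
rng = np.random.default_rng(1)
X = rng.normal(size=(50, 5))
K0 = np.tanh(0.1 * X @ X.T + 0.5)
K = nearest_psd_kernel(K0)
print(np.linalg.eigvalsh(K).min() >= -1e-10)   # True: K is (numerically) PSD
```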
The curse of dimensionality is a recognized challenge in nonparametric estimation. This paper develops a new L0-norm regularization approach to convex quantile and expectile regressions for subset variable selection. We show how to use mixed integer programming to solve the proposed L0-norm regularization approach in practice and build a link to the commonly used L1-norm regularization approach. A Monte Carlo study compares the finite-sample performance of the proposed L0-penalized convex quantile and expectile regression approaches with that of the L1-norm regularization approaches. The proposed approach is further applied to benchmark the sustainable development performance of the OECD countries and to empirically analyze the accuracy of the dimensionality reduction of variables. The results from the simulation and the application illustrate that the proposed L0-norm regularization approach addresses the curse of dimensionality more effectively than the L1-norm regularization approach in multidimensional spaces.
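To illustrate how an L0 penalty can be encoded with mixed integer programming, the sketch below handles the simpler case of a linear quantile regression with a big-M link between coefficients and binary selection variables; the convex (shape-constrained) quantile and expectile regressions studied in the paper add constraints not shown here, and the big-M constant, solver choice, and function name are assumptions.

```python
# Sketch of L0 (subset-selection) regularization via mixed integer programming, shown
# for plain linear quantile regression rather than the paper's convex quantile
# regression; big_m and the solver are assumptions.
import cvxpy as cp

def l0_quantile_regression(X, y, tau=0.5, lam=1.0, big_m=100.0):
    """Pinball loss at quantile tau with an L0 (cardinality) penalty on the coefficients."""
    d = X.shape[1]
    beta = cp.Variable(d)
    beta0 = cp.Variable()
    z = cp.Variable(d, boolean=True)            # z_j = 1 iff beta_j is allowed to be nonzero
    resid = y - X @ beta - beta0
    pinball = cp.sum(cp.maximum(tau * resid, (tau - 1) * resid))
    constraints = [cp.abs(beta) <= big_m * z]   # big-M link: z_j = 0 forces beta_j = 0
    prob = cp.Problem(cp.Minimize(pinball + lam * cp.sum(z)), constraints)
    prob.solve()                                # requires a MIP-capable solver (e.g. SCIP, CBC, GLPK_MI)
    return beta.value, beta0.value, z.value
```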
We give a formal and complete characterization of the explicit regularizer induced by dropout in deep linear networks with squared loss. We show that (a) the explicit regularizer is composed of an $\ell_2$-path regularizer and other terms that are also rescaling-invariant, (b) the convex envelope of the induced regularizer is the squared nuclear norm of the network map, and (c) for a sufficiently large dropout rate, we characterize the global optima of the dropout objective. We validate our theoretical findings with empirical results.
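As a quick numerical illustration of the kind of identity involved, the script below Monte Carlo checks the explicit dropout regularizer in the simplest one-hidden-layer linear (matrix factorization) case, where the expected dropout loss equals the deterministic loss plus the rescaling-invariant penalty $\frac{1-\theta}{\theta}\sum_i \|u_i\|^2\|v_i\|^2$; the deep, data-dependent setting of the paper is more general than this sanity check.

```python
# Monte Carlo check of the explicit dropout regularizer for the one-hidden-layer
# linear (matrix factorization) case:
#   E||Y - U diag(r)/theta V'||_F^2 = ||Y - U V'||_F^2 + (1-theta)/theta * sum_i ||u_i||^2 ||v_i||^2.
# A sanity check of the flavor of result in the abstract, not its deep-network case.
import numpy as np

rng = np.random.default_rng(0)
m, n, r, theta = 8, 6, 4, 0.7
U, V = rng.normal(size=(m, r)), rng.normal(size=(n, r))
Y = rng.normal(size=(m, n))

def dropout_loss(num_samples=100_000):
    total = 0.0
    for _ in range(num_samples):
        mask = rng.binomial(1, theta, size=r) / theta     # inverted dropout scaling
        total += np.linalg.norm(Y - (U * mask) @ V.T) ** 2
    return total / num_samples

explicit = (np.linalg.norm(Y - U @ V.T) ** 2
            + (1 - theta) / theta * np.sum(np.sum(U**2, 0) * np.sum(V**2, 0)))
print(dropout_loss(), explicit)   # the two numbers should nearly agree
```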
In this paper, we propose a way to combine two acceleration techniques for the $\ell_1$-regularized least squares problem: safe screening tests, which make it possible to eliminate useless dictionary atoms, and the use of fast structured approximations of the dictionary matrix. To do so, we introduce a new family of screening tests, termed stable screening, which can cope with approximation errors on the dictionary atoms while keeping the safety of the test (i.e., zero risk of rejecting atoms belonging to the solution support). Some of the main existing screening tests are extended to this new framework. The proposed algorithm uses a coarser (but faster) approximation of the dictionary in the initial iterations and then switches to better approximations, eventually adopting the original dictionary. A systematic switching criterion based on the duality gap saturation and the screening ratio is derived. Simulation results show significant reductions in both computational complexity and execution time for a wide range of tested scenarios.
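The sketch below captures the sphere-test logic that such rules build on, extended with a per-atom slack for approximation error: if the dual optimum is known to lie in a ball B(center, radius) and each approximate atom is within eps[j] of the true one, a triangle-inequality bound on |a_j' theta*| lets atoms be rejected safely. This bound is an illustrative version, not necessarily the paper's (possibly tighter) stable tests, and the safe ball is assumed to come from a standard construction such as a duality-gap sphere.

```python
# Sketch of a sphere-style safe screening test that tolerates atom approximation
# error (triangle-inequality bound; the paper's stable tests may be tighter).
# Assumes a safe dual region B(center, radius) is already available and that
# ||a_j - a_tilde_j|| <= eps[j] for every atom.
import numpy as np

def stable_screen(A_tilde, eps, center, radius):
    """Return a boolean mask of atoms that can be safely removed (True = reject)."""
    inner = np.abs(A_tilde.T @ center)                 # |a_tilde_j' center|
    atom_norms = np.linalg.norm(A_tilde, axis=0)       # ||a_tilde_j||
    theta_bound = np.linalg.norm(center) + radius      # max ||theta|| over the ball
    upper = inner + radius * atom_norms + eps * theta_bound
    return upper < 1.0                                 # bound < 1 => x_j* = 0, atom rejectable
```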
l1-norm quantile regression is a common choice when outliers or heavy-tailed errors are present in high-dimensional data sets. However, it is computationally expensive to solve this problem when the feature dimension of the data is ultra high. To the best of our knowledge, existing screening rules cannot speed up the computation of l1-norm quantile regression, owing to the non-differentiability of the quantile function/pinball loss. In this paper, we introduce the dual circumscribed sphere technique and propose a novel l1-norm quantile regression screening rule. Our rule is expressed as a closed-form function of the given data and eliminates inactive features at low computational cost. Numerical experiments on simulated and real data sets show that this screening rule can eliminate almost all inactive features. Moreover, it can reduce computational time by a factor of up to 23 compared with computation without the screening rule.
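For context on how such a rule is used, the sketch below shows the screen-then-solve pattern: features flagged inactive are dropped before solving the pinball-loss problem and their coefficients are set back to zero afterwards. The boolean `keep` mask is assumed to be produced by a screening rule of the kind proposed here (its closed form is not reproduced), and the function name and cvxpy formulation are illustrative.

```python
# Screen-then-solve pattern for l1-norm quantile regression: solve only over the
# features kept by a screening rule, then scatter the coefficients back. The rule
# producing `keep` is assumed external; the solve is a plain pinball-loss LP.
import numpy as np
import cvxpy as cp

def l1_quantile_regression_screened(X, y, keep, tau=0.5, lam=1.0):
    """keep: boolean mask of the features NOT eliminated by the screening rule."""
    d = X.shape[1]
    X_red = X[:, keep]                                  # reduced design matrix
    beta_red = cp.Variable(X_red.shape[1])
    beta0 = cp.Variable()
    resid = y - X_red @ beta_red - beta0
    pinball = cp.sum(cp.maximum(tau * resid, (tau - 1) * resid))   # quantile/pinball loss
    prob = cp.Problem(cp.Minimize(pinball + lam * cp.norm1(beta_red)))
    prob.solve()
    beta = np.zeros(d)                                  # eliminated features stay at zero
    beta[keep] = beta_red.value
    return beta, beta0.value
```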