The predict-then-optimize framework is fundamental in many practical settings: predict the unknown parameters of an optimization problem, and then solve the problem using the predicted values of the parameters. A natural loss function in this setting is the cost of the decisions induced by the predicted parameters, in contrast to the prediction error of the parameters themselves. This loss function was recently introduced in Elmachtoub and Grigas (2017) and referred to as the Smart Predict-then-Optimize (SPO) loss. In this work, we seek to provide bounds on how well the performance of a prediction model fit on training data generalizes out-of-sample, in the context of the SPO loss. Since the SPO loss is non-convex and non-Lipschitz, standard results for deriving generalization bounds do not apply. We first derive bounds based on the Natarajan dimension that, in the case of a polyhedral feasible region, scale at most logarithmically in the number of extreme points, but, in the case of a general convex feasible region, have linear dependence on the decision dimension. By exploiting the structure of the SPO loss function and a key property of the feasible region, which we denote as the strength property, we can dramatically improve the dependence on the decision and feature dimensions. Our approach and analysis rely on placing a margin around problematic predictions that do not yield unique optimal solutions, and then providing generalization bounds in the context of a modified margin SPO loss function that is Lipschitz continuous. Finally, we characterize the strength property and show that the modified SPO loss can be computed efficiently for both strongly convex bodies and polytopes with an explicit extreme point representation.
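To make the loss concrete, the following is a minimal sketch of the SPO loss for a linear objective minimized over a polyhedral feasible region, where w*(c) denotes an optimal solution of min_{w in S} c'w. The two-dimensional feasible region, the use of scipy.optimize.linprog as the solver, and the helper name spo_loss are illustrative assumptions and not part of the paper.

```python
# A minimal sketch of the SPO loss, assuming a polyhedral feasible region
# S = {w in [0,1]^2 : w_1 + w_2 >= 1}; the instance, the solver choice
# (scipy.optimize.linprog), and the helper name spo_loss are illustrative.
import numpy as np
from scipy.optimize import linprog

def spo_loss(c_pred, c_true, A_ub, b_ub):
    """SPO loss: true cost of the decision induced by the predicted cost vector,
    minus the true optimal cost, i.e. c_true' w*(c_pred) - c_true' w*(c_true)."""
    # Decision induced by the predicted cost vector.
    w_pred = linprog(c_pred, A_ub=A_ub, b_ub=b_ub, bounds=(0, 1)).x
    # Optimal objective value under the true cost vector.
    z_true = linprog(c_true, A_ub=A_ub, b_ub=b_ub, bounds=(0, 1)).fun
    return float(c_true @ w_pred - z_true)

# Encode the constraint w_1 + w_2 >= 1 as -w_1 - w_2 <= -1.
A_ub = np.array([[-1.0, -1.0]])
b_ub = np.array([-1.0])
print(spo_loss(np.array([2.0, 1.0]), np.array([1.0, 3.0]), A_ub, b_ub))  # 2.0
```

Note that when the predicted cost vector admits multiple optimal solutions, the plug-in decision w*(c_pred) is not unique and the loss above is ill-defined at such points; this is exactly the situation that motivates the margin-based modification of the SPO loss described in the abstract.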