We give a brief presentation of capacity theory and show how it naturally yields a measurable selection theorem, following the approach of Dellacherie (1972). We then present the classical method for proving the dynamic programming principle for discrete-time stochastic control problems using measurable selection arguments. Finally, we propose a continuous-time extension, i.e., an abstract framework for the continuous-time dynamic programming principle (DPP).
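For orientation, the discrete-time principle referred to above is typically stated as follows (generic notation, not taken from the text: $V_n$ is the stage-$n$ value function, $A$ the action set, $r$ a running reward, $g$ a terminal reward). Measurable selection is what provides, for each state $x$, a measurably chosen near-optimal action, so that $V_n$ is itself measurable:

```latex
% Discrete-time dynamic programming principle (generic notation):
\[
  V_n(x) \;=\; \sup_{a \in A} \mathbb{E}\big[\, r(x, a) + V_{n+1}(X_{n+1}) \,\big|\, X_n = x,\ a_n = a \,\big],
  \qquad V_N(x) = g(x).
\]
```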
We aim to give an overview of how to derive the dynamic programming principle for a general stochastic control/stopping problem using measurable selection techniques. By considering their martingale problem formulation, we show how to check the required measurability conditions for different classes of control problems.
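In continuous time, the principle takes the following typical form (again generic notation, not a quotation from the paper: $X^{\alpha}$ is the controlled process, $f$ a running reward, and $\theta$ ranges over stopping times in $[t, T]$; measurable selection is what justifies pasting $\varepsilon$-optimal controls beyond $\theta$):

```latex
% Continuous-time dynamic programming principle (generic notation):
\[
  V(t, x) \;=\; \sup_{\alpha}\, \mathbb{E}\Big[ \int_t^{\theta} f\big(s, X^{\alpha}_s, \alpha_s\big)\, ds
    \;+\; V\big(\theta, X^{\alpha}_{\theta}\big) \Big].
\]
```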
For years, there has been interest in approximation methods for solving dynamic programming problems, because of the inherent complexity of computing optimal solutions characterized by Bellman's principle of optimality. A wide range of approximate dynamic programming (ADP) methods now exists. It is of great interest to guarantee that the performance of an ADP scheme be at least some known fraction, say $\beta$, of optimal. This paper introduces a general approach to bounding the performance of ADP methods, in this sense, in the stochastic setting. The approach is based on new results for bounding greedy solutions in string optimization problems, where one has to choose a string (ordered set) of actions to maximize an objective function. This bounding technique is inspired by submodularity theory, but submodularity is not required for establishing bounds. Instead, the bounding is based on quantifying certain notions of curvature of string functions; the smaller the curvatures, the better the bound. The key insight is that any ADP scheme is a greedy scheme for some surrogate string objective function that coincides in its optimal solution and value with those of the original optimal control problem. The ADP scheme is then amenable to the bounding technique mentioned above, and the curvatures of the surrogate objective determine the value $\beta$ of the bound. The surrogate objective and its curvatures depend on the specific ADP scheme.
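As a minimal sketch of the greedy string-optimization scheme these bounds apply to, the following toy compares a greedy string against the exhaustive optimum on a small diminishing-returns objective (the function names and the objective are ours for illustration; this is not the paper's surrogate construction or its curvature computation):

```python
import itertools

def greedy_string(actions, value, horizon):
    """Greedily extend a string (ordered tuple) of actions one step at a time.
    `value` maps an action string to a real objective; this is the generic
    greedy scheme that the curvature-based bounds apply to."""
    s = ()
    for _ in range(horizon):
        s += (max(actions, key=lambda a: value(s + (a,))),)
    return s

def optimal_string(actions, value, horizon):
    """Exhaustive optimum over all strings of a given length (tiny instances only)."""
    return max(itertools.product(actions, repeat=horizon), key=value)

# Toy objective with diminishing returns, so greedy performs well.
weights = {"a": 5.0, "b": 3.0, "c": 2.0}
def value(s):
    picked = sorted((weights[a] for a in set(s)), reverse=True)
    return sum(w / (1 + i) for i, w in enumerate(picked))

g = greedy_string(list(weights), value, 2)
o = optimal_string(list(weights), value, 2)
print(value(g) / value(o))  # empirical performance ratio (a "beta") on this toy
```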
This paper discusses the odds problem, proposed by Bruss in 2000, and its variants. A recurrence relation called a dynamic programming (DP) equation is used to find an optimal stopping policy of the odds problem and its variants. In 2013, Buchbinder, Jain, and Singh proposed a linear programming (LP) formulation for finding an optimal stopping policy of the classical secretary problem, which is a special case of the odds problem. Their linear program, which maximizes the probability of a win, differs from the DP equations that have long been known. This paper shows that an ordinary DP equation can be derived as a modification of the dual of a linear program, a class that includes the LP formulation proposed by Buchbinder, Jain, and Singh.
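The optimal policy for the odds problem is given in closed form by Bruss' odds theorem: sum the odds $p_j/(1-p_j)$ backwards from the last event until the sum first reaches 1, and stop on the first success from that index onward. A minimal sketch (variable names are ours; it assumes the cumulative odds reach 1 before the loop hits an event with $p_j = 1$, as in the example):

```python
def bruss_odds_policy(p):
    """Bruss' odds algorithm: given the success probabilities p[0..n-1] of
    independent events observed in sequence, return the index s from which
    one should stop on the first success, and the resulting win probability."""
    n = len(p)
    odds_sum, s = 0.0, 0
    for j in range(n - 1, -1, -1):         # sum the odds backwards ...
        odds_sum += p[j] / (1.0 - p[j])
        if odds_sum >= 1.0:                # ... until they first reach 1
            s = j
            break
    win = 1.0
    for j in range(s, n):
        win *= 1.0 - p[j]
    return s, win * sum(p[j] / (1.0 - p[j]) for j in range(s, n))

# Classical secretary problem with n = 20 candidates: the j-th candidate is the
# best so far with probability 1/j.  The threshold lands near n/e and the win
# probability near 1/e.
s, w = bruss_odds_policy([1.0 / j for j in range(1, 21)])
print(s, w)   # s = 7: observe the first 7 candidates, then stop on the next best-so-far
```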
We propose a discretization of the optimality principle in dynamic programming based on radial basis functions and Shepard's moving least squares approximation method. We prove convergence of the approximate optimal value function to the true one and present several numerical experiments.
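To illustrate the idea, here is a toy one-dimensional value iteration in which a plain inverse-distance Shepard interpolant stands in for the paper's RBF-based weights (the test problem, discretization, and names are ours, not the authors' scheme):

```python
import numpy as np

def shepard(x, nodes, values, p=2.0):
    """Degree-0 Shepard (inverse-distance-weighted) interpolation of the node
    values at a point x; a simple stand-in for RBF-based weights."""
    d = np.abs(nodes - x)
    if np.any(d < 1e-12):                  # x coincides with a node
        return float(values[np.argmin(d)])
    w = d ** (-p)
    return float(w @ values / w.sum())

# Toy 1-d problem: minimize discounted running cost x^2, dynamics x' = x + u*dt.
nodes = np.linspace(-1.0, 1.0, 41)
controls = np.linspace(-1.0, 1.0, 9)
beta, dt = 0.95, 0.1
V = np.zeros_like(nodes)
for _ in range(200):                       # value iteration on the node values
    V = np.array([min(x**2 * dt + beta * shepard(np.clip(x + u * dt, -1.0, 1.0),
                                                 nodes, V)
                      for u in controls)
                  for x in nodes])
print(shepard(0.5, nodes, V))              # approximate value at x = 0.5
```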
Renegar (2016) introduced a novel approach to transforming generic conic optimization problems into unconstrained, uniformly Lipschitz continuous minimization problems. We introduce radial transformations generalizing these ideas, equipped with an entirely new motivation and development that avoids any reliance on convex cones or functions. Perhaps of greatest practical importance, this facilitates the development of new families of projection-free first-order methods applicable even in the presence of nonconvex objectives and constraint sets. Our generalized construction of this radial transformation uncovers that it is dual (i.e., self-inverse) for a wide range of functions, including all concave objectives. This gives a powerful new duality relating optimization problems to their radially dual problems. For a broad class of functions, we characterize continuity, differentiability, and convexity under the radial transformation, as well as develop a calculus for it. This radial duality provides a strong foundation for designing projection-free radial optimization algorithms, which is carried out in the second part of this work.
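Up to conventions, the radial transformation in this line of work can be written for a nonnegative function $f$ as follows (one standard form of the Renegar-style construction; the paper's generalized definition may differ in details such as the admissible domain of $f$):

```latex
% Radial transformation of a nonnegative function f (one standard convention):
\[
  f^{\Gamma}(y) \;=\; \sup\big\{\, v > 0 \;:\; v\, f(y/v) \ge 1 \,\big\}.
\]
% Radial duality (self-inverseness) is the statement that
% (f^{\Gamma})^{\Gamma} = f on a suitable class of functions,
% which includes all nonnegative concave objectives.
```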