We provide a characterization of the set of real-valued functions that can be the value function of some polynomial game. Specifically, we prove that a function $u : \mathbb{R} \to \mathbb{R}$ is the value function of some polynomial game if and only if $u$ is a continuous piecewise rational function.
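As a purely illustrative numerical companion to this statement (not a construction from the paper), one can approximate the value function $u(t)$ of a parametrized zero-sum polynomial game by discretizing both strategy spaces and solving each resulting matrix game as a linear program; the payoff polynomial below is a hypothetical example.

```python
# Illustrative sketch only: approximate the value function u(t) of a
# parametrized zero-sum game with polynomial payoff p(x, y, t) on [0,1]^2
# by discretizing both strategy spaces and solving the resulting matrix
# game as a linear program for each parameter value t.
import numpy as np
from scipy.optimize import linprog

def matrix_game_value(A):
    """Value of the zero-sum matrix game A (row player maximizes)."""
    m, n = A.shape
    # Variables: p_1..p_m (row mixed strategy) and v (game value); minimize -v.
    c = np.concatenate([np.zeros(m), [-1.0]])
    # Constraints v - p^T A e_j <= 0 for every column j.
    A_ub = np.hstack([-A.T, np.ones((n, 1))])
    b_ub = np.zeros(n)
    # Probabilities sum to one.
    A_eq = np.concatenate([np.ones(m), [0.0]]).reshape(1, -1)
    b_eq = np.array([1.0])
    bounds = [(0, None)] * m + [(None, None)]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    return res.x[-1]

# Hypothetical polynomial payoff to the maximizer; t is the parameter.
def payoff(x, y, t):
    return (x - y) ** 2 + t * x * y - t ** 2 * y

grid = np.linspace(0.0, 1.0, 41)          # discretized pure strategies
for t in np.linspace(-1.0, 1.0, 9):
    A = payoff(grid[:, None], grid[None, :], t)
    print(f"t = {t:+.2f}   u(t) approx {matrix_game_value(A):.4f}")
```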
We obtain several game characterizations of Baire 1 functions between Polish spaces $X$, $Y$, which extend a recent result of V. Kiss. We then propose similar characterizations for equi-Baire 1 families of functions. Using similar ideas, we also give game characterizations of Baire measurable and Lebesgue measurable functions.
In this paper, we propose a pseudo-polynomial-size LP formulation for finding a payoff vector in the least core of a weighted voting game. The numbers of variables and constraints in our formulation are both bounded by $O(nW_+)$, where $n$ is the number of players and $W_+$ is the total sum of the (integer) voting weights. Using our formulation, a commercial LP solver computes a payoff vector in the least core of practically sized weighted voting games within a few seconds. We also extend our approach to vector weighted voting games.
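For readers unfamiliar with the least core, the sketch below writes out the textbook exponential-size least-core LP for a tiny hypothetical weighted voting game; the paper's contribution is precisely to replace this exponential formulation with one of size $O(nW_+)$.

```python
# Illustrative sketch: the textbook (exponential-size) least-core LP for a
# tiny weighted voting game.  The paper proposes a compact O(n W_+)
# reformulation; this brute-force version only shows what the least core is.
from itertools import combinations
import numpy as np
from scipy.optimize import linprog

weights = [4, 3, 2, 1]        # hypothetical integer voting weights
quota = 6
n = len(weights)
players = range(n)

def v(S):
    """Simple game value: 1 if coalition S meets the quota, else 0."""
    return 1 if sum(weights[i] for i in S) >= quota else 0

# Variables: x_1..x_n (payoffs) and eps; minimize eps.
c = np.concatenate([np.zeros(n), [1.0]])
A_ub, b_ub = [], []
for size in range(1, n):                      # all proper non-empty coalitions
    for S in combinations(players, size):
        # v(S) - sum_{i in S} x_i <= eps  <=>  -sum_{i in S} x_i - eps <= -v(S)
        row = np.zeros(n + 1)
        row[list(S)] = -1.0
        row[-1] = -1.0
        A_ub.append(row)
        b_ub.append(-v(S))
# Efficiency: payoffs sum to v(N); payoffs are unrestricted in sign.
A_eq = np.concatenate([np.ones(n), [0.0]]).reshape(1, -1)
b_eq = [v(players)]
bounds = [(None, None)] * (n + 1)
res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
print("least-core payoff:", res.x[:n], " eps:", res.x[-1])
```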
Many reinforcement learning (RL) environments in practice feature enormous state spaces that admit a compact factored description and can be modeled as Factored Markov Decision Processes (FMDPs). We present the first polynomial-time algorithm for RL with FMDPs that does not rely on an oracle planner; instead of requiring a linear transition model, it only requires a linear value function with a suitable local basis with respect to the factorization. With this assumption, we can solve FMDPs in polynomial time by constructing an efficient separation oracle for convex optimization. Importantly, and in contrast to prior work, we do not assume that the transitions on the various factors are independent.
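To fix notation for the key assumption (our own illustration, not code from the paper), a linear value function with a local basis over a factored state $s = (s[1], \dots, s[m])$ has the form $V_w(s) = \sum_k w_k\,\phi_k(s[Z_k])$, where each basis function $\phi_k$ reads only a small scope $Z_k$ of state factors:

```python
# Illustrative sketch: a linear value function over a local basis, where each
# basis function depends only on a small scope of state factors.
import numpy as np

# Hypothetical factored state with three binary factors.
state = {1: 0, 2: 1, 3: 1}

# Each basis function comes with a scope (the subset of factors it may read).
basis = [
    (lambda s: 1.0,                  ()),       # constant feature
    (lambda s: float(s[1]),          (1,)),     # depends on factor 1 only
    (lambda s: float(s[2] and s[3]), (2, 3)),   # depends on factors 2 and 3
]
w = np.array([0.5, -1.0, 2.0])                  # hypothetical learned weights

def value(s):
    """Linear value function over the local basis."""
    return float(np.dot(w, [phi(s) for phi, _scope in basis]))

print(value(state))   # 0.5 - 0.0 + 2.0 = 2.5
```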
We investigate a way to approximate the maximum of a polynomial over a polytopal region by using Handelman's polynomial decomposition and continuous multivariate generating functions. The maximization problem is NP-hard, but our approximation methods run in polynomial time when the dimension is fixed.
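As a toy illustration of the Handelman idea (our own sketch, not the paper's method), one can upper-bound $\max_{x \in [0,1]} p(x)$ by searching for the smallest $\lambda$ such that $\lambda - p(x) = \sum_{i+j \le d} c_{ij}\, x^i (1-x)^j$ with $c_{ij} \ge 0$; matching coefficients turns this certificate search into a linear program, and the bound improves as the degree $d$ grows.

```python
# Illustrative sketch of a Handelman-type relaxation: upper-bound max p(x)
# over the polytope [0,1] = {x : x >= 0, 1-x >= 0} by finding the smallest
# lambda such that lambda - p(x) = sum_{i+j<=d} c_ij x^i (1-x)^j, c_ij >= 0,
# via coefficient matching and a linear program.
import numpy as np
from numpy.polynomial import polynomial as P
from scipy.optimize import linprog

p = np.array([0.0, 4.0, -4.0])   # p(x) = 4x - 4x^2, true maximum is 1 at x = 1/2
d = 6                            # degree of the Handelman certificate

# Coefficient vectors (length d+1) of the products x^i (1-x)^j with i + j <= d.
basis = []
for i in range(d + 1):
    for j in range(d + 1 - i):
        coeffs = P.polymul(P.polypow([0.0, 1.0], i), P.polypow([1.0, -1.0], j))
        padded = np.zeros(d + 1)
        padded[: len(coeffs)] = coeffs
        basis.append(padded)
B = np.array(basis).T            # B[k, m] = coefficient of x^k in basis m

# Variables: [lambda, c_1, ..., c_m]; minimize lambda subject to
# lambda * [k == 0] - sum_m c_m B[k, m] = p_k for every power k.
m = B.shape[1]
c = np.concatenate([[1.0], np.zeros(m)])
A_eq = np.hstack([np.eye(d + 1)[:, :1], -B])
b_eq = np.zeros(d + 1)
b_eq[: len(p)] = p
bounds = [(None, None)] + [(0, None)] * m
res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
print("Handelman-type upper bound on max p over [0,1]:", res.x[0])
```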
We consider dynamic programming problems with finite, discrete-time horizons whose discrete state spaces are too high-dimensional for direct computation of the value function from the Bellman equation. For the case that the value function of the dynamic program is concave extensible and submodular in its state space, we present a new algorithm that computes deterministic upper and stochastic lower bounds on the value function, in the spirit of dual dynamic programming. We then show that the proposed algorithm terminates after a finite number of iterations. Finally, we demonstrate the efficacy of our approach on a high-dimensional numerical example from delivery slot pricing in attended home delivery.
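To convey the flavour of the two bounds on a much smaller problem (our own toy, not the paper's algorithm), the sketch below applies the same idea to a two-stage maximization: supporting-hyperplane cuts on a concave recourse function give a deterministic upper bound via a cut-based master problem, while Monte Carlo simulation of the current candidate decision gives a stochastic lower bound.

```python
# Illustrative two-stage toy (not the paper's algorithm) of dual-dynamic-
# programming-style bounds.  We maximize the concave profit
#   f(x) = -c*x + E[r*min(x, D)]
# by building supporting-hyperplane cuts on the concave recourse function
# Q(x) = E[r*min(x, D)].  The cut-based master problem gives a deterministic
# upper bound; simulating the current candidate x gives a stochastic lower bound.
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
c, r, x_max = 1.0, 3.0, 10.0

def demand(size):
    """Hypothetical demand model: uniform on [0, 10]."""
    return rng.uniform(0.0, 10.0, size)

cuts = [(r * x_max, 0.0)]   # initial valid cut: Q(x) <= r*x_max on [0, x_max]
for it in range(10):
    # Master problem: max -c*x + theta  s.t.  theta <= a_k + g_k*x for all cuts.
    # Variables [x, theta]; linprog minimizes, so the objective is [c, -1].
    A_ub = [[-g, 1.0] for a, g in cuts]
    b_ub = [a for a, g in cuts]
    res = linprog([c, -1.0], A_ub=A_ub, b_ub=b_ub,
                  bounds=[(0.0, x_max), (None, None)])
    x_hat, upper = res.x[0], -res.fun          # deterministic upper bound

    # New cut at x_hat: Q(x_hat) and a supergradient, estimated from samples.
    D = demand(2000)
    q_val = r * np.minimum(x_hat, D).mean()
    grad = r * (D > x_hat).mean()
    cuts.append((q_val - grad * x_hat, grad))  # Q(x) <= a + g*x by concavity

    # Stochastic lower bound: simulated profit of the candidate decision x_hat.
    lower = -c * x_hat + r * np.minimum(x_hat, demand(2000)).mean()
    print(f"iter {it}: lower ~ {lower:.3f} <= optimum <= upper = {upper:.3f}")
```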