Bias-Variance Trade-off and Overlearning in Dynamic Decision Problems

Abstract in English

Modern Monte Carlo-type approaches to dynamic decision problems are reformulated as empirical loss minimization, which allows classical results from statistical machine learning to be applied directly. These computational methods are then analyzed in this framework to demonstrate both their effectiveness and their susceptibility to generalization error. Standard applications of classical results establish that overlearning is possible, and hence that a bias-variance trade-off exists, by connecting over-trained networks to anticipating controls. On the other hand, non-asymptotic estimates based on Rademacher complexity show that these algorithms converge for sufficiently large training sets. A stylized example, studied numerically, illustrates these possibilities, including the role of the problem dimension in the degree of overlearning, and the effectiveness of the approach.
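
The link between over-trained controls and anticipation can be illustrated with a minimal sketch. This is a hypothetical toy problem, not the stylized example studied in the paper: a one-period decision in which an action is chosen after observing a state X but before an independent noise Z is realized, with loss (a - Z)^2. The function name `run_toy_experiment`, the nearest-neighbour lookup standing in for an over-trained network, and all parameter values are assumptions made for illustration only.

```python
import numpy as np


def run_toy_experiment(n_train=50, n_test=10_000, seed=0):
    """Compare a memorizing control with the best adapted control (toy setup)."""
    rng = np.random.default_rng(seed)

    # The state X is observed before acting; the noise Z is realized
    # afterwards and is independent of X, so X carries no information on Z.
    x_train = rng.uniform(size=n_train)
    z_train = rng.standard_normal(n_train)
    x_test = rng.uniform(size=n_test)
    z_test = rng.standard_normal(n_test)

    def memorizing_control(x):
        # Stand-in for an over-trained network: look up the nearest training
        # state and return the noise observed on that training path.
        nearest = np.abs(x[:, None] - x_train[None, :]).argmin(axis=1)
        return z_train[nearest]

    # Empirical (in-sample) loss: the control "anticipates" Z on training paths.
    in_sample = np.mean((memorizing_control(x_train) - z_train) ** 2)
    # Out-of-sample loss of the same control on fresh, independent paths.
    out_of_sample = np.mean((memorizing_control(x_test) - z_test) ** 2)
    # Best adapted control: the constant action E[Z] = 0.
    adapted = np.mean((0.0 - z_test) ** 2)
    return in_sample, out_of_sample, adapted


if __name__ == "__main__":
    in_sample, out_of_sample, adapted = run_toy_experiment()
    print(f"memorizing control, in-sample loss:     {in_sample:.3f}")
    print(f"memorizing control, out-of-sample loss: {out_of_sample:.3f}")
    print(f"adapted control, out-of-sample loss:    {adapted:.3f}")
```

Under these assumptions, the memorizing control attains zero empirical loss, which no non-anticipating control can achieve, while on fresh samples it performs worse than the simple adapted control. Enlarging the training set does not rescue the lookup rule itself; the convergence statement concerns hypothesis classes whose complexity is controlled relative to the sample size.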
