Finite Sample Analyses for TD(0) with Function Approximation


Abstract

TD(0) is one of the most commonly used algorithms in reinforcement learning. Despite this, there is no existing finite sample analysis for TD(0) with function approximation, even for the linear case. Our work is the first to provide such results. Existing convergence rates for Temporal Difference (TD) methods apply only to somewhat modifi
