Finite Sample Analyses for TD(0) with Function Approximation

نشر في Gal Dalal بتاريخ 2017 في مجال الهندسة المعلوماتية والبحث باللغة English تحميل البحث

الملخص بالإنكليزية

TD(0) is one of the most commonly used algorithms in reinforcement learning. Despite this, there is no existing finite sample analysis for TD(0) with function approximation, even for the linear case. Our work is the first to provide such results. Existing convergence rates for Temporal Difference (TD) methods apply only to somewhat modifi

تحميل البحث