Gaussian Approximation and Multiplier Bootstrap for Polyak-Ruppert Averaged Linear Stochastic Approximation with Applications to TD Learning

Sergey Samsonov, Eric Moulines, Qi Man Shao, Zhuo Song Zhang, Alexey Naumov

Research output: Contribution to journalConference article published in journalpeer-review

Abstract

In this paper, we obtain the Berry-Esseen bound for multivariate normal approximation for the Polyak-Ruppert averaged iterates of the linear stochastic approximation (LSA) algorithm with decreasing step size. Moreover, we prove the non-asymptotic validity of the confidence intervals for parameter estimation with LSA based on multiplier bootstrap. This procedure updates the LSA estimate together with a set of randomly perturbed LSA estimates upon the arrival of subsequent observations. We illustrate our findings in the setting of temporal difference learning with linear function approximation.

Original languageEnglish
JournalAdvances in Neural Information Processing Systems
Volume37
Publication statusPublished - 2024
Externally publishedYes
Event38th Conference on Neural Information Processing Systems, NeurIPS 2024 - Vancouver, Canada
Duration: 9 Dec 202415 Dec 2024

Bibliographical note

Publisher Copyright:
© 2024 Neural information processing systems foundation. All rights reserved.

Fingerprint

Dive into the research topics of 'Gaussian Approximation and Multiplier Bootstrap for Polyak-Ruppert Averaged Linear Stochastic Approximation with Applications to TD Learning'. Together they form a unique fingerprint.

Cite this