\begin{abstract}
We consider Linear Stochastic Approximation (LSA) with constant stepsize
and Markovian data. Viewing the joint process of the data and LSA
iterate as a time-homogeneous Markov chain, we prove its convergence
to a unique limiting and stationary distribution in Wasserstein distance
and establish non-asymptotic, geometric convergence rates. Furthermore,
we show that the bias vector of this limit admits an infinite series
expansion with respect to the stepsize. Consequently, the bias is
proportional to the stepsize up to higher-order terms. This result
stands in contrast with LSA under i.i.d.\ data, for which the bias
vanishes. In the reversible chain setting, we provide a general characterization
of the relationship between the bias and the mixing time of the Markovian
data, establishing that they are roughly proportional to each other.

While Polyak-Ruppert tail-averaging reduces the variance of LSA iterates,
it does not affect the bias. The above characterization allows us
to show that the bias can be reduced using Richardson-Romberg extrapolation
with $m\ge2$ stepsizes, which eliminates the $m-1$ leading terms
in the bias expansion. This extrapolation scheme leads to an exponentially
smaller bias and an improved mean squared error, both theoretically
and empirically. Our results immediately apply to the Temporal Difference
learning algorithm with linear function approximation, and to stochastic
gradient descent applied to quadratic functions.
\end{abstract}
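
% Illustrative sketch only: the symbols below are assumed for exposition
% and are not fixed by the abstract.
As a minimal illustration of the extrapolation idea, assume the following
notation, which the abstract itself does not fix: $\theta^{\ast}$ is the target
solution, $\theta_{\infty}^{(\alpha)}$ is the limiting iterate under stepsize
$\alpha$, and $B^{(1)}, B^{(2)}, \dots$ are stepsize-independent coefficient
vectors of the bias expansion, so that
\[
  \mathrm{E}\bigl[\theta_{\infty}^{(\alpha)}\bigr] - \theta^{\ast}
  = \alpha B^{(1)} + \alpha^{2} B^{(2)} + O(\alpha^{3}).
\]
For $m=2$ with the (assumed) stepsize pair $\alpha$ and $2\alpha$, combining the
two limits with weights $2$ and $-1$ gives
\[
  2\,\mathrm{E}\bigl[\theta_{\infty}^{(\alpha)}\bigr]
  - \mathrm{E}\bigl[\theta_{\infty}^{(2\alpha)}\bigr] - \theta^{\ast}
  = 2\bigl(\alpha B^{(1)} + \alpha^{2} B^{(2)}\bigr)
  - \bigl(2\alpha B^{(1)} + 4\alpha^{2} B^{(2)}\bigr) + O(\alpha^{3})
  = -2\alpha^{2} B^{(2)} + O(\alpha^{3}).
\]
The first-order term cancels, so under these assumptions the $m-1=1$ leading
term of the bias is eliminated and the remaining bias is of order $\alpha^{2}$.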
