\begin{abstract}
We consider Linear Stochastic Approximation (LSA) with constant stepsize
and Markovian data. Viewing the joint process of the data and LSA
iterate as a time-homogeneous Markov chain, we prove its convergence
to a unique limiting and stationary distribution in Wasserstein distance
and establish non-asymptotic, geometric convergence rates. Furthermore,
we show that the bias vector of this limit admits an infinite series
expansion with respect to the stepsize. Consequently, the bias is
proportional to the stepsize up to higher order terms. This result
stands in contrast with LSA under i.i.d.\ data, for which the bias
vanishes. In the reversible chain setting, we provide a general characterization
of the relationship between the bias and the mixing time of the Markovian
data, establishing that they are roughly proportional to each other.

While Polyak--Ruppert tail-averaging reduces the variance of LSA iterates,
it does not affect the bias. The above characterization allows us
to show that the bias can be reduced using Richardson--Romberg extrapolation
with $m\ge2$ stepsizes, which eliminates the $m-1$ leading terms
in the bias expansion. This extrapolation scheme leads to an exponentially
smaller bias and an improved mean squared error, both theoretically
and empirically. Our results immediately apply to the Temporal Difference
learning algorithm with linear function approximation, and stochastic
gradient descent applied to quadratic functions.
\end{abstract}
