
\begin{proof}({\bf Theorem~\ref{thm:arxiv-spsa-theta-convergence}})

In view of (A6), it suffices to analyse the following equivalent update rule for MCPG on the high-probability set $E^\eta$:
%
\begin{align*}
&\theta_i(t+1)  = \Gamma_i \bigg( \theta_i(t) -  a(t) \frac{J^{\theta(t) +\delta\Delta(t)}(x_0) - J^{\theta(t)-\delta\Delta(t)}(x_0)}{2\delta\Delta_i(t)}
            \bigg).
\end{align*}

Now, using a standard Taylor series expansion (see Chapter 5 of \citep{Bhatnagar13SR}), one can show that $\dfrac{J^{\theta +\delta\Delta}(x_0) - J^{\theta-\delta\Delta}(x_0)}{2\delta\Delta_i}$ is a biased estimator of $\nabla_{\theta_i} J^\theta(x_0)$, whose bias vanishes as $\delta \rightarrow 0$. More precisely, we have
%
\begin{align*}
\frac{J^{\theta +\delta\Delta}(x_0) - J^{\theta-\delta\Delta}(x_0)}{2\delta\Delta_i}
&\longrightarrow_{\delta \rightarrow 0} \nabla_{\theta_i} J^\theta(x_0).
\end{align*}
%
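The limit above is to be read in conditional expectation over the perturbation $\Delta$. A brief sketch of the expansion, under assumptions that are standard for SPSA but are not restated in this proof (namely, that $J^\theta(x_0)$ is three times continuously differentiable in $\theta$ with bounded third derivatives, and that the components $\Delta_j$ are independent, symmetric, $\pm 1$-valued random variables), runs as follows. The second-order terms of the two expansions cancel in the symmetric difference, so that
%
\begin{align*}
J^{\theta \pm \delta\Delta}(x_0)
&= J^\theta(x_0) \pm \delta \sum_{j} \Delta_j \nabla_{\theta_j} J^\theta(x_0)
 + \frac{\delta^2}{2} \sum_{j,k} \Delta_j \Delta_k \nabla^2_{\theta_j \theta_k} J^\theta(x_0) \pm O(\delta^3), \\
\frac{J^{\theta +\delta\Delta}(x_0) - J^{\theta-\delta\Delta}(x_0)}{2\delta\Delta_i}
&= \nabla_{\theta_i} J^\theta(x_0)
 + \sum_{j \neq i} \frac{\Delta_j}{\Delta_i} \nabla_{\theta_j} J^\theta(x_0) + O(\delta^2).
\end{align*}
%
Under the assumed independence and symmetry, $E\big[\Delta_j / \Delta_i\big] = 0$ for $j \neq i$, so the cross terms vanish in conditional expectation given $\theta$ and only the $O(\delta^2)$ bias remains.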

Thus, Eq.~\eqref{eq:spsa-update-rule} can be seen to be a discretization of the ODE~\eqref{eq:arxiv-theta-ode}. Further, $\Z_\lambda$ is an asymptotically stable attractor for the ODE~\eqref{eq:arxiv-theta-ode}, with $J^\theta(x_0)$ itself serving as a strict Lyapunov function. This can be inferred as follows:

%
\begin{align*}
\dfrac{d J^\theta(x_0)}{d t}
= \nabla_\theta J^\theta(x_0) \dot \theta
= \nabla_\theta J^\theta(x_0) \bar\Gamma\big(-\nabla_\theta J^\theta(x_0)\big) < 0.
\end{align*}
%
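As an illustration, and assuming (as is usual for projected ODEs of this type, though it is not restated in this proof) that $\bar\Gamma$ acts as the identity at points $\theta$ in the interior of the constraint set, the inequality above reduces at such points to
%
\begin{align*}
\dfrac{d J^\theta(x_0)}{d t}
= - \big\| \nabla_\theta J^\theta(x_0) \big\|^2,
\end{align*}
%
which is strictly negative wherever $\nabla_\theta J^\theta(x_0) \neq 0$.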

The claim now follows from Theorem 5.3.3, pp.~191--196 of~\citep{kushner-clark}. Note that the final claim holds on $E^\eta$, the high-probability set on which the bias of the MFMC estimator is bounded.
% rewriting Lemma 1 from \cite{borkar2008stochastic} with the deterministic sequence $\xi(t)$ replacing the martingale difference noise there.

\end{proof}
