\begin{proof}({\bf Theorem \ref{thm:arxiv-spsa-theta-convergence}})
In view of (A6), it suffices to analyse the following equivalent update rule for MCPG on the high-probability set $E^\eta$:
%
\begin{align*}
&\theta_i(t+1) = \Gamma_i \bigg( \theta_i(t) - a(t) \frac{J^{\theta(t) +\delta\Delta(t)}(x_0) - J^{\theta(t)-\delta\Delta(t)}(x_0)}{2\delta\Delta_i(t)}
\bigg).
\end{align*}
Now, using a standard Taylor series expansion (see Chapter 5 of \citep{Bhatnagar13SR}), it can be shown that $\dfrac{J^{\theta +\delta\Delta(t)}(x_0) - J^{\theta-\delta\Delta(t)}(x_0)}{2\delta\Delta_i(t)}$ is a biased estimator of $\nabla_{\theta_i} J^\theta(x_0)$, whose bias vanishes as $\delta \rightarrow 0$. More precisely, we have
%
\begin{align*}
\frac{J^{\theta +\delta\Delta(t)}(x_0) - J^{\theta-\delta\Delta(t)}(x_0)}{2\delta\Delta_i(t)}
&\longrightarrow_{\delta \rightarrow 0} \nabla_{\theta_i} J^\theta(x_0).
\end{align*}
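To sketch the expansion behind this limit, suppose (as is standard for SPSA, although these conditions are assumptions here and are not restated from the main text) that $J^\theta(x_0)$ is three times continuously differentiable in $\theta$ and that the components $\Delta_j(t)$ are independent, symmetric $\pm 1$-valued random variables. Writing $\Delta$ for $\Delta(t)$, a Taylor expansion around $\theta$ gives
\begin{align*}
J^{\theta \pm \delta\Delta}(x_0)
&= J^\theta(x_0) \pm \delta \sum_{j} \Delta_j \nabla_{\theta_j} J^\theta(x_0) + \frac{\delta^2}{2} \Delta^\top \nabla^2_\theta J^\theta(x_0) \Delta + O(\delta^3),\\
\frac{J^{\theta +\delta\Delta}(x_0) - J^{\theta-\delta\Delta}(x_0)}{2\delta\Delta_i}
&= \nabla_{\theta_i} J^\theta(x_0) + \sum_{j \neq i} \frac{\Delta_j}{\Delta_i} \nabla_{\theta_j} J^\theta(x_0) + O(\delta^2).
\end{align*}
Under the stated assumptions, the ratio $\Delta_j / \Delta_i$ has zero mean for every $j \neq i$, so the cross terms average out and only the $O(\delta^2)$ term contributes to the bias.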
%
Thus, Eq.~\eqref{eq:spsa-update-rule} can be seen to be a discretization of the ODE~\eqref{eq:arxiv-theta-ode}. Further, $\Z_\lambda$ is an asymptotically stable attractor for the ODE~\eqref{eq:arxiv-theta-ode}, with $J^\theta(x_0)$ itself serving as a strict Lyapunov function. This can be inferred as follows:
%
\begin{align*}
\dfrac{d J^\theta(x_0)}{d t}
= \nabla_\theta J^\theta(x_0) \dot \theta
= \nabla_\theta J^\theta(x_0) \bar\Gamma\big(-\nabla_\theta J^\theta(x_0)\big) < 0.
\end{align*}
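As a sanity check of the sign, consider a point $\theta$ in the interior of the constraint set, and assume (this is the usual definition of the projected-ODE operator, not restated in this proof) that $\bar\Gamma$ acts as the identity there. Treating the gradient as a column vector, we then get
\begin{align*}
\dfrac{d J^\theta(x_0)}{d t}
= \nabla_\theta J^\theta(x_0)^\top \big(-\nabla_\theta J^\theta(x_0)\big)
= -\big\| \nabla_\theta J^\theta(x_0) \big\|_2^2,
\end{align*}
which is strictly negative whenever $\nabla_\theta J^\theta(x_0) \neq 0$.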
%
The claim now follows from Theorem 5.3.3, pp. 191--196 of~\citep{kushner-clark}. Note that the final claim holds on $E^\eta$, the high-probability set on which the bias of the MFMC estimator is bounded.
% rewriting Lemma 1 from \cite{borkar2008stochastic} with the deterministic sequence $\xi(t)$ replacing the martingale difference noise there.
\end{proof}
