\begin{abstract}
This paper analyzes a two-timescale stochastic algorithm framework for bilevel optimization.
Bilevel optimization is a class of problems that exhibit a two-level structure: the goal is to minimize an outer objective function whose variables are constrained to be the optimal solution of an (inner) optimization problem.
We consider the case where the inner problem is unconstrained and strongly convex, while the outer problem is constrained and has a smooth objective function. We propose a two-timescale stochastic approximation ({\sf TTSA}) algorithm for tackling such a bilevel problem. In the algorithm, a stochastic gradient update with a larger step size is used for the inner problem, while a projected stochastic gradient update with a smaller step size is used for the outer problem.
We analyze the convergence rates of the {\sf TTSA} algorithm under various settings: when the outer problem is strongly convex (resp.~weakly convex), the {\sf TTSA} algorithm finds an $\mathcal{O}(K_{\max}^{-2/3})$-optimal (resp.~$\mathcal{O}(K_{\max}^{-2/5})$-stationary) solution, where $K_{\max}$ is the total number of iterations.
As an application, we show that a two-timescale natural actor-critic proximal policy optimization algorithm can be viewed as a special case of our {\sf TTSA} framework. Importantly, the natural actor-critic algorithm is shown to converge at a rate of $\mathcal{O}(K_{\max}^{-1/4})$ in terms of the gap in expected discounted reward compared to a globally optimal policy.
\end{abstract}
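
The two-timescale scheme described in the abstract can be sketched as follows. The symbols here are notation assumed for illustration and are not fixed by the abstract: $x$ is the outer variable constrained to a closed convex set $X$, $y$ is the inner variable, $f$ and $g$ are the outer and inner objectives, $\mathcal{P}_X$ is the Euclidean projection onto $X$, $h_g^k$ is a stochastic estimate of $\nabla_y g(x^k, y^k)$, and $h_f^k$ is a stochastic estimate of the outer gradient. Under these assumptions, the bilevel problem reads
\[
  \min_{x \in X} \; f\bigl(x, y^\star(x)\bigr)
  \quad \mbox{subject to} \quad
  y^\star(x) \in \mathop{\mathrm{arg\,min}}_{y} \; g(x, y),
\]
and one iteration of a two-timescale update takes the form
\[
  y^{k+1} = y^k - \beta_k \, h_g^k,
  \qquad
  x^{k+1} = \mathcal{P}_X\bigl( x^k - \alpha_k \, h_f^k \bigr),
\]
where $\beta_k$ is the larger (inner) step size and $\alpha_k \le \beta_k$ is the smaller (outer) step size. Two-timescale schemes typically let $\alpha_k / \beta_k \to 0$, so that the inner iterate $y^k$ tracks $y^\star(x^k)$ while $x^k$ changes slowly.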