\begin{abstract}
		This paper analyzes a two-timescale stochastic algorithm framework for bilevel optimization.
		Bilevel optimization is a class of problems that exhibit a two-level structure: the goal is to minimize an outer objective function whose variables are constrained to be an optimal solution of an inner optimization problem.
		We consider the case where the inner problem is unconstrained and strongly convex, while the outer problem is constrained and has a smooth objective function. We propose a two-timescale stochastic approximation ({\sf TTSA}) algorithm for tackling such a bilevel problem. In the algorithm, a stochastic gradient update with a larger step size is used for the inner problem, while a projected stochastic gradient update with a smaller step size is used for the outer problem.
		We analyze the convergence rates of the {\sf TTSA} algorithm under various settings: when the outer problem is strongly convex (resp.~weakly convex), the {\sf TTSA} algorithm finds an $\mathcal{O}(K_{\max}^{-2/3})$-optimal (resp.~$\mathcal{O}(K_{\max}^{-2/5})$-stationary) solution, where $K_{\max}$ is the total number of iterations.
		As an application, we show that a two-timescale natural actor-critic proximal policy optimization algorithm can be viewed as a special case of our {\sf TTSA} framework. Importantly, the natural actor-critic algorithm is shown to converge at a rate of $\mathcal{O}(K_{\max}^{-1/4})$ in terms of the gap in expected discounted reward compared to a global optimal policy.
	\end{abstract}

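
% Illustrative sketch only. The symbols below are notation assumed for this
% note and are not defined in the abstract itself.
The two-timescale structure described in the abstract can be sketched as follows, using notation that is assumed here purely for illustration: $f$ and $g$ denote the outer and inner objectives, $X$ the outer constraint set, $P_X$ the Euclidean projection onto $X$, $h_g^k$ and $h_f^k$ stochastic estimates of the inner and outer gradients at iteration $k$, and $\beta_k$, $\alpha_k$ the inner and outer step sizes. Under these assumptions, the bilevel problem takes the form
\[
	\min_{x \in X} \; f\bigl(x, y^\star(x)\bigr)
	\quad \mbox{subject to} \quad
	y^\star(x) \in \arg\min_{y} \; g(x, y),
\]
and one iteration of a scheme of this kind would read
\[
	y^{k+1} = y^k - \beta_k \, h_g^k,
	\qquad
	x^{k+1} = P_X\bigl( x^k - \alpha_k \, h_f^k \bigr),
\]
where the inner step size $\beta_k$ is larger than the outer step size $\alpha_k$. In two-timescale schemes the ratio $\alpha_k / \beta_k$ is typically driven to zero, so that the inner iterate $y^k$ can track $y^\star(x^k)$ while $x^k$ moves slowly. The precise step-size schedules and gradient estimators are not stated in the abstract and are left unspecified here.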