\begin{abstract}
Gradient-based methods have been widely used for system design and optimization in diverse application domains.
Recently, there has been renewed
interest in studying the theoretical properties of these methods in the context of control and reinforcement learning.
This article surveys recent developments in policy optimization, a gradient-based iterative approach to feedback control synthesis
that has been popularized by the successes of reinforcement learning.
Our exposition takes an interdisciplinary perspective that
connects control theory, reinforcement learning, and large-scale optimization. We review a number of recently developed theoretical results on the optimization landscape, global convergence, and sample complexity of gradient-based methods for various continuous control problems such as the linear quadratic regulator (LQR), $\mathcal{H}_\infty$ control, risk-sensitive control, linear quadratic Gaussian (LQG) control, and output feedback synthesis. In conjunction with these optimization results, we also discuss how direct policy
optimization handles stability and robustness concerns in learning-based control, two main desiderata in control engineering.
We conclude the survey by pointing out several challenges and opportunities at the intersection of
learning and control.
\end{abstract}