
\begin{abstract}%

In this paper, we revisit model-free policy search on an important robust control benchmark, namely \(\mu\)-synthesis. In the general output-feedback setting, no convex formulation exists for this problem, and hence global optimality guarantees are not expected. \citet{apkarian2011nonsmooth} presented a nonconvex nonsmooth policy optimization approach for this problem and achieved state-of-the-art design results by using subgradient-based policy search algorithms that generate update directions in a model-based manner. Despite the lack of convexity and global optimality guarantees, these subgradient-based policy search methods have led to impressive numerical results in practice.

Building upon this policy optimization perspective, our paper extends these subgradient-based search methods to a model-free setting. Specifically, we examine the effectiveness of two model-free policy optimization strategies: the model-free non-derivative sampling method and zeroth-order policy search with uniform smoothing. We perform an extensive numerical study to demonstrate that both methods consistently replicate the design outcomes achieved by their model-based counterparts. Additionally, we provide theoretical justification showing that convergence to stationary points can be guaranteed for our model-free \(\mu\)-synthesis under certain assumptions related to the coerciveness of the cost function.

Overall, our results demonstrate that derivative-free policy optimization offers a competitive and viable approach for solving general output-feedback \(\mu\)-synthesis problems in the model-free setting.


\end{abstract}
