
\begin{abstract}

We propose a policy iteration algorithm for solving the multiplicative-noise linear quadratic output feedback design problem. The algorithm solves a set of coupled Riccati equations for estimation and control that arise from a partially observable Markov decision process (POMDP) under a class of linear dynamic control policies.

In numerical experiments, the algorithm converges far faster than value iteration, previously the only known algorithm for this class of problems.

The results suggest promising future research directions for policy optimization algorithms in more general POMDPs, including the potential to develop novel approximate data-driven approaches when model parameters are not available.

\end{abstract}
