
\begin{abstract}

% In this article, we derive two new algorithms for discrete-time, finite-horizon nonlinear optimal control problems for deterministic systems, also known as trajectory optimization problems. Both algorithms are inspired by a novel theoretical paradigm known as probabilistic optimal control. The main idea is to reformulate the optimal control problem as a formally equivalent probabilistic inference (or estimation) problem. As a result, the optimal control problem can be addressed using the Expectation-Maximization algorithm. We show that this yields a fixed-point iteration over probabilistic policies that converges to the deterministic optimal policy in the limit. We further discuss two strategies to evaluate the policies and rely on state-of-the-art uncertainty quantification methods to distil two new algorithms. Structurally, the algorithms are most closely related to the well-established differential dynamic programming algorithm and to contemporary relatives that use sigma-point methods to avoid direct gradient evaluations. The main advantage of our algorithms is an improved balance between exploration and exploitation over the iterations, which ultimately results in improved numerical stability and accelerated convergence. These and other properties are demonstrated on different nonlinear systems.

% \end{abstract}
