
\begin{abstract}
  Reinforcement learning has traditionally focused on learning state-dependent policies to solve optimal control problems in a \emph{closed-loop} fashion.
  In this work, we introduce the paradigm of \emph{open-loop reinforcement learning} where a fixed action sequence is learned instead.
  We present three new algorithms: one robust model-based method and two sample-efficient model-free methods.
  Rather than basing our algorithms on Bellman's equation from dynamic programming, our work builds on \emph{Pontryagin's principle} from the theory of open-loop optimal control.
  We provide convergence guarantees and evaluate all methods empirically on a pendulum swing-up task, as well as on two high-dimensional MuJoCo tasks, demonstrating remarkable performance compared to existing baselines.
\end{abstract}
