
\begin{abstract}\vspace{-1mm}
Robotic systems must be able to make decisions quickly and robustly when operating in uncertain and dynamic environments. While Reinforcement Learning (RL) can be used to compute optimal policies with little prior knowledge about the environment, it suffers from slow convergence. %In addition, since learning in RL occurs at the time scale of rollouts,  generalization is poor in situations where either the environment or the dynamics changes faster than the time scale of learning.
An alternative approach is Model Predictive Control (MPC), which optimizes policies quickly % (every few time steps) and therefore is more reactive than RL. However, MPC
but requires accurate models of the system dynamics and environment.
In this paper we propose a new approach, adaptive probabilistic trajectory optimization, that combines the benefits of RL and MPC. Our method uses scalable approximate inference to learn and update probabilistic models in an online, incremental fashion while also computing optimal control policies via successive local approximations. %The theoretical foundation for the proposed optimization framework is scalable approximate inference, which is the major computational bottleneck for many approaches.  In this regard
We present two variations of our algorithm % efficient approximate inference methods
based on the Sparse Spectrum Gaussian Process (SSGP) model, %Finally,
and we test our algorithm on three learning tasks, demonstrating the effectiveness and efficiency of our approach.
\end{abstract}
