\begin{abstract}
Robust reinforcement learning aims to derive an optimal behavior that accounts for model uncertainty
in dynamical systems. However, previous studies have shown that, by considering the worst-case scenario,
robust policies can be overly conservative. Our \textit{soft-robust} framework is an attempt to overcome
this issue. In this paper, we present a novel Soft-Robust Actor-Critic algorithm (SR-AC).
It learns an optimal policy with respect to a distribution over an uncertainty set and stays robust to model uncertainty while avoiding
the conservativeness of robust strategies. We show the convergence of SR-AC and test the efficiency of our approach
on different domains by comparing it against regular learning methods and their robust formulations.
\end{abstract}