
\begin{abstract}
  Entropy-regularized Markov decision processes have been widely used in reinforcement learning.
  This paper is concerned with the primal-dual formulation of the entropy-regularized problems.
  Standard first-order methods suffer from slow convergence due to the lack of strict convexity and
  concavity. To address this issue, we first introduce a new quadratically convexified primal-dual
  formulation. The natural gradient ascent-descent of the new formulation enjoys a global convergence
  guarantee and an exponential convergence rate. We also propose a new interpolating metric that
  further accelerates convergence significantly. Numerical results are provided to demonstrate
  the performance of the proposed methods under multiple settings.
\end{abstract}
