\begin{abstract}
Entropy-regularized Markov decision processes have been widely used in reinforcement learning.
This paper is concerned with the primal-dual formulation of entropy-regularized problems.
Standard first-order methods suffer from slow convergence due to the lack of strict convexity and
concavity. To address this issue, we first introduce a new quadratically convexified primal-dual
formulation. The natural gradient ascent-descent of the new formulation enjoys a global convergence
guarantee and an exponential convergence rate. We also propose a new interpolating metric that
significantly accelerates the convergence further. Numerical results are provided to demonstrate
the performance of the proposed methods under multiple settings.
\end{abstract}