\begin{abstract}
We consider the problem of computing optimal policies in average-reward Markov decision processes.
This classical problem can be formulated as a linear program directly amenable to saddle-point
optimization methods, albeit with a number of variables that is linear in the number of states. To
address this issue, recent work has considered a linearly relaxed version of the resulting
saddle-point problem. Our work aims at achieving a better understanding of this relaxed
optimization problem by characterizing the conditions necessary for convergence to the
optimal policy, and designing an optimization algorithm enjoying fast convergence rates that are
independent of the size of the state space. Notably, our characterization points out some potential
issues with previous work.
\end{abstract}
