
\begin{abstract}
    Monte-Carlo counterfactual regret minimization (MCCFR) is the
    state-of-the-art algorithm for solving sequential games that are
    too large for full tree traversals. It works by using gradient
    estimates that can be computed via sampling. However, stochastic
    methods for sequential games have not been investigated extensively
    beyond MCCFR. In this paper, we introduce a new framework for
    developing stochastic regret minimization methods. This framework
    allows us to use any regret-minimization algorithm, coupled with
    any gradient estimator. The MCCFR algorithm can be analyzed as a
    special case of our framework, and this analysis leads to
    significantly stronger theoretical guarantees on convergence, while
    simultaneously yielding a simplified proof. Our framework also allows
    us to instantiate several new stochastic methods for solving
    sequential games. We report extensive experiments on three games,
    in which some variants of our methods outperform MCCFR.
\end{abstract}
