\begin{abstract}
        Monte-Carlo counterfactual regret minimization (MCCFR) is the
        state-of-the-art algorithm for solving sequential games that are
        too large for full tree traversals. It works by using gradient
        estimates that can be computed via sampling. However, stochastic
        methods for sequential games have not been investigated extensively
        beyond MCCFR. In this paper we develop a new framework for
        constructing stochastic regret-minimization methods. This framework
        allows us to use any regret-minimization algorithm, coupled with
        any gradient estimator. The MCCFR algorithm can be analyzed as a
        special case of our framework, and this analysis leads to
        significantly stronger theoretical guarantees on convergence, while
        simultaneously yielding a simplified proof. Our framework allows us
        to instantiate several new stochastic methods for solving
        sequential games. We show extensive experiments on three games,
        where some variants of our methods outperform MCCFR.
    \end{abstract}
