\begin{abstract}
  Regret-based algorithms are highly efficient at finding approximate Nash equilibria in sequential games such as poker.
  However, most regret-based algorithms, including counterfactual regret minimization (CFR) and its variants, rely on iterate averaging to achieve convergence.
  Inspired by recent advances in \emph{last-iterate} convergence of optimistic algorithms in zero-sum normal-form games, we provide a comprehensive study of last-iterate convergence in zero-sum extensive-form games with perfect recall (EFGs), using various optimistic regret-minimization algorithms over treeplexes.
  These include algorithms using the vanilla entropy or squared Euclidean norm regularizers, as well as their dilated versions, which admit more efficient implementations.
  In contrast to CFR, we show that all of these algorithms enjoy last-iterate convergence, with some of them even converging \emph{exponentially} fast.
  We also provide experiments to further support our theoretical results.
\end{abstract}