1: \begin{abstract}
2: Efficient computation of the optimal transport distance between two distributions is an algorithmic subroutine that underpins a wide range of applications.
3: This paper develops a scalable first-order optimization-based method that computes optimal transport to within $\varepsilon$ additive accuracy
4: with runtime $\widetilde{O}( n^2/\varepsilon)$, where $n$ denotes the dimension of the probability distributions of interest.
5: Our algorithm achieves state-of-the-art computational guarantees among all first-order methods, while exhibiting favorable numerical performance compared to classical algorithms like Sinkhorn and Greenkhorn.
6: Our algorithm design rests on two key elements: (a) converting the original problem into a bilinear minimax problem over probability distributions;
7: (b) exploiting the extragradient idea --- in conjunction with entropy regularization and adaptive learning rates --- to accelerate convergence.
8: \end{abstract}
9: