\begin{abstract}
The Frank-Wolfe algorithm has regained much interest
for its use
%since it has been successfully used
in structurally constrained machine learning applications. However, one major limitation of the algorithm is its slow local convergence, which is caused by zig-zagging behavior.
We observe that this zig-zagging phenomenon can be viewed as an artifact of discretization: when the method is interpreted as an Euler discretization of a continuous-time flow, that flow does not zig-zag.
%In contrast to previous methods that directly break the behavior, we figure out the intuition behind this behavior, which is an artifact of truncation discretization error.
For this reason, we propose multistep Frank-Wolfe variants based on discretizations of the same flow whose truncation errors decay as $O(\Delta^p)$, where $\Delta$ is the discretization step size and $p$ is the method's order.
This strategy ``stabilizes'' the method and allows tools like line search and momentum to be more beneficial. However, in terms of convergence rate, our result is ultimately negative, suggesting that no Runge-Kutta-type discretization scheme can achieve a better convergence rate than the vanilla Frank-Wolfe method.
We believe that this analysis adds to the growing body of knowledge on flow analysis for optimization methods, and serves as a cautionary tale on the ultimate usefulness of multistep methods.
\end{abstract}