\begin{abstract}

Though second-order optimization methods are highly effective,
popular approaches in machine learning such as SGD and Adam use only first-order information due to the difficulty of computing curvature in high dimensions.
%
We present FOSI, a novel meta-algorithm that improves the performance of any first-order optimizer by efficiently incorporating second-order information during the optimization process.
In each iteration, FOSI implicitly splits the function into two quadratic functions defined on orthogonal subspaces, then uses a second-order method to minimize the first and the base optimizer to minimize the second.
%
We prove that FOSI converges and further show that it improves the condition number for a large family of optimizers.
Our empirical evaluation demonstrates that FOSI improves the convergence rate and optimization time of GD, Heavy-Ball, and Adam when applied to several deep neural network training tasks such as audio classification, transfer learning, and object classification, as well as when applied to convex functions.
Furthermore, our results show that FOSI outperforms other second-order methods such as K-FAC and L-BFGS.

\end{abstract}
