\begin{abstract}
This work provides simple algorithms for multi-class (and
multi-label) prediction in settings where both the number of
examples $n$ and the data dimension $d$ are relatively
large. These robust and parameter-free algorithms are essentially
iterative least-squares updates, and they are versatile both in theory
and in practice. On the theoretical front, we present several
variants with convergence guarantees. Owing to their effective use
of second-order structure, these algorithms substantially
outperform first-order methods in many practical scenarios. On
the empirical side, we present a scalable stagewise variant of our
approach, which achieves dramatic computational speedups over
popular optimization packages such as Liblinear and Vowpal Wabbit
on standard datasets (MNIST and CIFAR-10), while attaining
state-of-the-art accuracies.
\end{abstract}