
\begin{abstract}
The matrix completion problem seeks to recover a $d\times d$ ground
truth matrix of low rank $r\ll d$ from observations of its individual
elements. Real-world matrix completion is often a huge-scale optimization
problem, with $d$ so large that even the simplest full-dimension
vector operations with $O(d)$ time complexity become prohibitively
expensive. Stochastic gradient descent (SGD) is one of the few algorithms
capable of solving matrix completion on a huge scale, and can also
naturally handle streaming data over an evolving ground truth. Unfortunately,
SGD experiences a dramatic slowdown when the underlying ground truth
is ill-conditioned; it requires at least $O(\kappa\log(1/\epsilon))$
iterations to get $\epsilon$-close to a ground truth matrix with condition
number $\kappa$. In this paper, we propose a preconditioned version
of SGD that preserves all the favorable practical qualities of SGD
for huge-scale online optimization while also making it agnostic to
$\kappa$. For a symmetric ground truth and the root mean square error
(RMSE) loss, we prove that preconditioned SGD converges to $\epsilon$-accuracy
in $O(\log(1/\epsilon))$ iterations, with a rapid linear convergence
rate as if the ground truth were perfectly conditioned with $\kappa=1$.
In our experiments, we observe a similar acceleration for item-item
collaborative filtering on the MovieLens25M dataset via a pairwise ranking loss,
with 100 million training pairs and 10 million testing pairs.
{[}See supporting code at \url{https://github.com/Hong-Ming/ScaledSGD}.{]}
%ill-conditioned matrix completion under the root mean square error (RMSE) loss,
%Euclidean distance matrix (EDM) completion under pairwise square
%loss.
\end{abstract}
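
To make the contrast between SGD and a preconditioned variant concrete,
the following is a minimal sketch under assumptions that the abstract
does not itself state. Assume the symmetric ground truth is factored
as $M=ZZ^{T}$ with $Z$ of size $d\times r$, the iterate $X$ is also
$d\times r$ with rows $x_{1}^{T},\dots,x_{d}^{T}$, and a single sampled
entry $M_{ij}$ with $i\ne j$ contributes the loss
$f_{ij}(X)=(x_{i}^{T}x_{j}-M_{ij})^{2}$. Writing $e_{ij}=x_{i}^{T}x_{j}-M_{ij}$
for the residual and absorbing the factor of $2$ from the gradient
into the step size $\alpha$, a plain SGD step would modify only rows
$i$ and $j$:
\[
x_{i}^{+}=x_{i}-\alpha\,e_{ij}\,x_{j},\qquad x_{j}^{+}=x_{j}-\alpha\,e_{ij}\,x_{i}.
\]
One plausible preconditioned step, in the style of scaled gradient
methods and assumed here purely for illustration, right-multiplies the
stochastic gradient by $P=(X^{T}X)^{-1}$:
\[
x_{i}^{+}=x_{i}-\alpha\,e_{ij}\,Px_{j},\qquad x_{j}^{+}=x_{j}-\alpha\,e_{ij}\,Px_{i}.
\]
Both updates are evaluated at the old $x_{i}$, $x_{j}$, and $P$. Because
only two rows of $X$ change, $X^{T}X=\sum_{k}x_{k}x_{k}^{T}$ changes
by a term of rank at most four, so $P$ could in principle be refreshed
by a few Sherman--Morrison rank-one updates at $O(r^{2})$ cost each,
without any $O(d)$ operation. Whether this matches the exact algorithm
analyzed in the paper is an assumption of this sketch.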
