\begin{abstract}
Collaborative filtering algorithms are important building blocks in many practical recommendation systems.
For example, many large-scale data processing environments include collaborative filtering models for which the Alternating Least Squares (ALS) algorithm is used to compute latent factor matrix decompositions.
In this paper, we propose an approach to accelerate the convergence of parallel ALS-based optimization methods for collaborative filtering using a nonlinear conjugate gradient (NCG) wrapper around the ALS iterations.
We also provide a parallel implementation of the accelerated
ALS-NCG algorithm in the Apache Spark distributed data
processing environment, together with an efficient line search
technique that requires only one pass over the data on distributed datasets.
In serial numerical experiments on a Linux workstation and
parallel numerical experiments on a 16-node cluster with 256
computing cores, we demonstrate that the combined ALS-NCG
method requires far fewer iterations and less time than
standalone ALS to reach movie rankings with high accuracy on
the MovieLens 20M dataset.
In parallel, ALS-NCG can achieve an acceleration factor of 4
or greater in wall-clock time when an accurate solution is
desired; furthermore, the acceleration factor increases
as greater numerical precision is required in the
solution.
In addition, the NCG acceleration mechanism is efficient
in parallel and scales linearly with problem size on
synthetic datasets with up to nearly 1 billion ratings.
The acceleration mechanism is general and may also be
applicable to other optimization methods for collaborative
filtering.
\end{abstract}