\begin{abstract}
Although spectral embedding is a widely applied dimension reduction technique in various fields, making it {\em scalable} and {\em robust} enough to handle ``big data'' remains challenging.
%
Motivated by the need to handle such data, we propose a novel spectral embedding algorithm, which we call {\em Robust and Scalable Embedding via Landmark Diffusion} (ROSELAND). In short, we measure the affinity between two points via a landmark set consisting of a small number of points, and ``diffuse'' on the dataset via the landmark set to obtain a spectral embedding. The embedding is not only scalable and robust, but also preserves the geometric properties under the manifold setup.
%
The Roseland can be viewed as a generalization of the commonly applied spectral embedding algorithm, the {\em diffusion map} (DM), in the sense that it shares various properties of the DM.
%
In addition to providing a theoretical justification of the Roseland under the manifold setup, including handling the U-statistics-like quantities, establishing $L^\infty$ spectral convergence with a rate, and offering a high-dimensional noise analysis, we present various numerical simulations and compare the Roseland with other existing algorithms.

{\bf Keywords:}
Graph Laplacian, Diffusion Maps, Nystr\"om, Landmark, Scalability, Robustness, Roseland

\end{abstract}