\begin{abstract}

Anchor-based techniques reduce the computational
complexity of spectral clustering algorithms.
Although empirical tests have shown promising results, theoretical support for the anchoring approach is currently lacking.
We define a specific anchor-based algorithm and show that it
is amenable to rigorous analysis, as well as being effective in practice.
We establish the theoretical consistency of the method
in an asymptotic setting where data is sampled
from an underlying continuous probability distribution.
In particular, we provide sharp asymptotic conditions on the algorithm parameters
which ensure that, with high probability,
the anchor-based method recovers
disjoint clusters that are mutually separated by a positive distance.
We illustrate the
performance of the algorithm on synthetic
data and explain how the
theoretical convergence analysis can be used to inform the
practical choice of parameter scalings.
We also test the accuracy and efficiency of the algorithm on two large-scale real data sets.
We find that the algorithm offers clear advantages over standard spectral clustering.
We also find that it
is competitive with the state-of-the-art
LSC method of Chen and Cai (Twenty-Fifth AAAI Conference on Artificial Intelligence, 2011), while having the added
benefit of a consistency guarantee.

\end{abstract}
