\begin{abstract}
Owing to its application to difficult and diverse clustering and
outlier detection problems, support-based clustering has recently
drawn considerable attention. Support-based clustering methods
proceed in two phases: finding the domain of novelty and performing
the clustering assignment. To find the domain of novelty, the training
time required by current solvers is typically over-quadratic in the
training size, which precludes the use of support-based clustering
methods on large-scale datasets. In this paper, we propose applying
the Stochastic Gradient Descent (SGD) framework to the first phase of
support-based clustering to find the domain of novelty, together with
a new strategy for performing the clustering assignment. However, the
direct application of SGD to the first phase of support-based clustering
is vulnerable to the curse of kernelization, that is, the model size
grows linearly with the amount of data accumulated over time. To address
this issue, we invoke the budget approach, which allows us to restrict
the model size to a small budget. Our new strategy for clustering
assignment enables fast computation by reducing the task of clustering
assignment on the full training set to the same task on a significantly
smaller set. We also provide a rigorous theoretical analysis of the
convergence rate of the proposed method. Finally, we validate the
proposed method on well-known clustering datasets and show that it
offers comparable clustering quality while achieving a significant
speedup over the baselines.
\end{abstract}