\begin{abstract}
Owing to its application in solving difficult and diverse clustering
and outlier detection problems, support-based clustering has recently
drawn considerable attention. Support-based clustering methods always
proceed in two phases: finding the domain of novelty and performing
the clustering assignment. To find the domain of novelty, the training
time of current solvers is typically over-quadratic in the
training-set size, which precludes the use of support-based clustering
on large-scale datasets. In this paper, we propose applying
the Stochastic Gradient Descent (SGD) framework to the first phase of
support-based clustering to find the domain of novelty, together with
a new strategy for performing the clustering assignment. However, the direct
application of SGD to the first phase of support-based clustering
is vulnerable to the curse of kernelization, that is, the model size
grows linearly with the amount of data accumulated over time. To address
this issue, we invoke the budget approach, which allows us to restrict
the model size to a small budget. Our new strategy for clustering
assignment enables fast computation by reducing the task
of clustering assignment on the full training set to the same task
on a significantly smaller set. We also provide a rigorous theoretical
analysis of the convergence rate of the proposed method. Finally,
we validate the proposed method on well-known clustering datasets
and show that it achieves clustering quality comparable to
the baselines while offering a significant speedup.
\end{abstract}