\begin{abstract}We consider the paradigm of unsupervised anomaly detection, which involves identifying anomalies within a dataset in the absence of labeled examples.
Although distance-based methods are among the top performers for unsupervised anomaly detection, they are highly sensitive to the choice of the number of nearest neighbors.
In this paper, we propose a new distance-based algorithm called \textit{bagged regularized $k$-distances for anomaly detection} (\textit{BRDAD}), which converts the unsupervised anomaly detection problem into a convex optimization problem.
Our BRDAD algorithm selects the weights by minimizing the \textit{surrogate risk}, i.e., the finite-sample bound of the empirical risk of the \textit{bagged weighted $k$-distances for density estimation} (\textit{BWDDE}).
This approach enables us to address the sensitivity of distance-based algorithms to the choice of hyperparameters.
Moreover, when dealing with large-scale datasets, the bagging technique incorporated in our BRDAD algorithm addresses the efficiency issues.
On the theoretical side, we establish fast convergence rates of the AUC regret of our algorithm and demonstrate that the bagging technique significantly reduces the computational complexity.
On the practical side, we conduct numerical experiments on anomaly detection benchmarks to illustrate the insensitivity of our algorithm to parameter selection compared with other state-of-the-art distance-based methods. Moreover, applying the bagging technique in our algorithm yields promising improvements on real-world datasets.
\end{abstract}