4ef9b176411875e1.tex
1: \begin{abstract}
2: 
3: 
4: \noindent
5: The kernelized Gram matrix $W$, constructed from data points $\{x_i\}_{i=1}^N$ as $W_{ij}= k_0(  \frac{ \| x_i - x_j \|^2} {\sigma^2} )$,
6: is widely used in graph-based geometric data analysis
7: and unsupervised learning. 
8: An important question is how to choose the kernel bandwidth $\sigma$,
9: and a common practice, called the self-tuned kernel, adaptively sets a bandwidth $\sigma_i$ at each point $x_i$ by the $k$-nearest neighbor (kNN) distance.
10: When $x_i$'s are sampled from a $d$-dimensional manifold embedded in a possibly high-dimensional space,
11: unlike with fixed-bandwidth kernels,
12: theoretical results on graph Laplacian convergence with self-tuned kernels
14: have been incomplete. 
15: This paper proves the convergence of the graph Laplacian operator $L_N$
16: to the manifold (weighted-)Laplacian for
17: a new family of kNN self-tuned kernels $W^{(\alpha)}_{ij}
18:  = k_0( \frac{  \| x_i - x_j \|^2}{ \epsilon \hat{\rho}(x_i) \hat{\rho}(x_j)}) / ( \hat{\rho}(x_i)^\alpha \hat{\rho}(x_j)^\alpha )$, 
19: where $\hat{\rho}$ is the bandwidth function estimated by kNN,
20: and the limiting operator is also parametrized by $\alpha$.
21: When $\alpha = 1$, the limiting operator is the weighted manifold Laplacian $\Delta_p$.
22: Specifically, we prove the point-wise convergence of $L_N f$ and the convergence of the graph Dirichlet form, with rates.
23: Our analysis is based on first establishing 
24: a $C^0$ consistency result for
25: $\hat{\rho}$,
26: which bounds the relative estimation error $|\hat{\rho} - \bar{\rho}|/\bar{\rho}$ uniformly
27: with high probability, 
28: where $\bar{\rho} = p^{-1/d}$, and $p$ is the data density function. 
29: Our theoretical results reveal the advantage of the self-tuned kernel over the fixed-bandwidth kernel
30: via smaller variance error in low-density regions.
31: The algorithm requires no prior knowledge of $d$ or of the data density.
32: The theoretical results are supported by numerical experiments on simulated data and hand-written digit image data.
33: \end{abstract}
34:
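
\noindent
For orientation, the target $\bar{\rho} = p^{-1/d}$ of the kNN bandwidth estimate can be motivated by a standard heuristic count.
This is a sketch only, under assumptions not stated in the abstract:
$R_k(x)$ denotes the distance from $x$ to its $k$-th nearest neighbor among the $N$ samples,
$V_d$ is the volume of the unit ball in $\mathbb{R}^d$,
$p$ is treated as locally constant, and curvature effects are ignored.
A ball of radius $R_k(x)$ around $x$ then contains about $k$ of the $N$ samples, so
\[
k \approx N \, p(x) \, V_d \, R_k(x)^d
\quad \Longrightarrow \quad
R_k(x) \approx \Big( \frac{k}{N V_d} \Big)^{1/d} p(x)^{-1/d}
= \Big( \frac{k}{N V_d} \Big)^{1/d} \bar{\rho}(x).
\]
Under these assumptions the kNN distance is proportional to $\bar{\rho}$ up to a factor that does not depend on $x$,
so the bandwidth is larger where $p$ is small.
This is consistent with measuring the error of $\hat{\rho}$ in the relative form $|\hat{\rho} - \bar{\rho}|/\bar{\rho}$.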