\begin{abstract}
    Graphlets are induced subgraph patterns that have been widely used to
    characterize the local topological structure of graphs across various
    domains, e.g., online social networks (OSNs) and biological networks.
    Discovering and computing graphlet statistics is highly challenging.
    First, the massive size of real-world graphs makes the exact computation of
    graphlet statistics extremely expensive. Second, the graph topology may not
    be readily available, so one has to resort to web crawling through the
    available application programming interfaces (APIs).
    In this work, we propose a general and novel framework to estimate graphlet
    statistics of ``{\em any size}''. Our framework is based on collecting
    samples through consecutive steps of random walks. We derive an analytical
    bound on the sample size (via the Chernoff-Hoeffding technique) that
    guarantees the convergence of our unbiased estimator.
    To further improve the accuracy, we introduce two novel optimization techniques
    that reduce the lower bound on the sample size. Experimental evaluations
    demonstrate that our methods outperform the state-of-the-art method by up to
    an order of magnitude in terms of both accuracy and time cost.
\end{abstract}