\begin{abstract}
Characterizing large online social networks (OSNs)
through node querying is a challenging task.
OSNs often impose severe constraints on the query rate,
limiting the sample size to a small fraction of the total network.
Various ad hoc subgraph sampling methods have been proposed,
but many of them give biased estimates
and provide no theoretical guarantees on their accuracy.
In this work, we focus on developing sampling methods for OSNs where querying
a node also reveals partial structural information about its neighbors.
Our methods are optimized for NoSQL graph databases
(if the database can be accessed directly),
or use the Web APIs available on most major OSNs for graph sampling.
We show that our sampling method yields an unbiased estimator
with provable convergence guarantees,
and that it is more accurate than current state-of-the-art methods.
We estimate metrics such as
node label density and edge label density,
two of the most fundamental network characteristics, from which other network characteristics can be derived.
We evaluate our methods on the fly on several live networks using
their native APIs.
Our simulation studies over a variety of offline datasets show that,
by including neighborhood information, our method reduces the number of samples required
to achieve the same estimation accuracy as state-of-the-art methods by a factor of four.
\end{abstract}