\begin{abstract}

The popularity of online social networks (OSNs) has given rise to a number of measurement studies that provide a first step towards their understanding.
So far, such studies have been based either on complete data sets provided directly by the OSN itself or on Breadth-First-Search (BFS) crawling of the social graph, which does not guarantee good statistical properties of the collected sample.
In this paper, we crawl the publicly available social graph and present the first unbiased sampling of Facebook (FB) users using a Metropolis-Hastings random walk with multiple chains.
We study the convergence properties of the walk and demonstrate the uniformity of the collected sample with respect to multiple metrics of interest.
We compare our crawling technique to baseline algorithms, namely BFS and simple random walk, as well as to the ``ground truth'' obtained through truly uniform sampling of userIDs.
%
Our contributions lie both in the measurement methodology and in the collected sample.
With regard to the methodology, our measurement technique (i) applies and combines known results from random walk sampling specifically in the OSN context and (ii) addresses system implementation aspects that have made the measurement of Facebook challenging so far.
%
%
With respect to the collected sample: (i) it is the first representative sample of FB users and we plan to make it publicly available; (ii) we perform a characterization of several key properties of the data set, and find that some of them are substantially different from what was previously believed based on non-representative OSN samples.
\end{abstract}