\begin{abstract}
Network representation learning (NRL) techniques have been successfully adopted in various data mining and machine learning applications.
Random walk based NRL is a popular paradigm that uses a set of random walks to capture network structural information
and then employs word2vec models to learn low-dimensional representations.
However, there is still no framework that unifies existing random walk based NRL models
and supports efficient learning from large networks.
The main obstacles are the diversity of random walk models and the inefficiency of the sampling methods used for random walk generation.
In this paper, we first introduce a new and efficient edge sampler
based on the Metropolis-Hastings sampling technique,
and theoretically show that the edge sampler converges to arbitrary discrete probability distributions.
Then we propose a random walk model abstraction,
in which users can easily define different transition probabilities by specifying dynamic edge weights and random walk states.
The abstraction is efficiently supported by our edge sampler, since the sampler can draw samples from an unnormalized probability distribution in constant time.
Finally, with the new edge sampler and random walk model abstraction,
we carefully implement a scalable NRL framework called \sys.
We conduct comprehensive experiments with five random walk based NRL models on eleven real-world datasets,
and the results clearly demonstrate the efficiency of \sys on billion-edge networks.
The code of \sys is available at https://github.com/shaoyx/UniNet.
\end{abstract}