abstract:6394c52ddba9fe64.tex

1: \begin{abstract}

2: % Compression

3: Summarizing large-scale directed graphs into small-scale representations is a useful but understudied problem. Conventional clustering approaches, which are based on ``Min-Cut''-style criteria, compress both the vertices and the edges of the graph into communities, which leads to a loss of directed edge information. In contrast, compressing the vertices while preserving the directed edge information provides a way to learn a small-scale representation of a directed graph.

4: % Reconstruction

5: The \textit{reconstruction error}, which measures the edge information preserved by the summarized graph, can be used to learn such a representation.

6: %

7: Compared to the original graphs, the summarized graphs are easier to analyze and can be used to extract group-level features that are useful for efficient interventions on population behavior.

8: In this paper, we present a model based on minimizing the \textit{reconstruction error} under non-negativity constraints; the model relates to a ``Max-Cut'' criterion that simultaneously identifies the compressed nodes and the directed compressed relations between these nodes.

9: A multiplicative update algorithm with column-wise normalization is proposed.

10: We further provide theoretical results on the identifiability of the model and on the convergence of the proposed algorithm.

11: Experiments are conducted to demonstrate the accuracy and robustness of the proposed method.

12: \end{abstract}

13: