\begin{abstract}
We reduce the cost of communication and synchronization in graph processing
by analyzing the fastest way to process graphs:
pushing the updates to a shared state or pulling the updates to
a private state.
We investigate the applicability of this push-pull dichotomy to various algorithms
and its impact on complexity, performance, and the number of locks, atomics, and reads/writes used.
We consider 11 graph algorithms, 3 programming
models, 2 graph abstractions, and various families
of graphs.
Our analysis reveals surprising differences between the push and pull
variants of different algorithms in performance, speed of convergence, and code
complexity; the insights are backed up by performance data from
hardware counters. We use these findings to show which variant is faster
for each algorithm and to develop generic strategies that enable even
higher speedups.
Our insights can be used to accelerate graph processing engines or libraries on
both massively parallel shared-memory machines and distributed-memory
systems.
\end{abstract}