\begin{abstract}
Training Graph Neural Networks (GNNs) at a large scale requires significant computational resources,
and the process is highly data-intensive.
One of the most effective ways to reduce resource requirements is minibatch training
coupled with graph sampling.
GNNs have the unique property that items in a minibatch have overlapping data.
However, the commonly implemented Independent Minibatching approach assigns each Processing
Element (PE) its own minibatch to process, leading to duplicated computation and duplicated input data accesses across PEs.
This amplifies the Neighborhood Explosion Phenomenon (NEP), which is the main bottleneck limiting scaling.
To reduce the effects of NEP in the multi-PE setting,
we propose a new approach called Cooperative Minibatching.
Our approach capitalizes on the fact that the size of the sampled subgraph is a concave function of the batch size, leading to
significant reductions in the amount of work per seed vertex as batch sizes increase. Hence, it is favorable for
processors equipped with a fast interconnect to work on a large minibatch together as a single larger processor, instead of working on separate smaller
minibatches, even though the global batch size is identical.
We also show how to take advantage of the same phenomenon in serial execution by generating dependent consecutive minibatches.
Our experimental evaluations show up to $4\times$ bandwidth savings for fetching vertex embeddings, simply by increasing
this dependency without harming model convergence. Combining our proposed approaches, we achieve up to 64\%
speedup over Independent Minibatching on single-node multi-GPU systems.
% UVC: why we had this?
%and show that load balancing is not an issue despite the use of lock-step communication.
\end{abstract}

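The concavity argument used in the abstract can be made concrete with a short sketch.
The notation here is an assumption introduced only for illustration and is not taken from the paper:
let $S(b)$ denote the expected number of vertices in the sampled subgraph for a minibatch of $b$ seed vertices,
with $S(0) = 0$, and let $P \ge 1$ be the number of PEs.
If $S$ is concave, then for any $b > 0$,
\[
  S(b) = S\!\left(\frac{1}{P}\, Pb + \left(1 - \frac{1}{P}\right) 0\right)
       \ge \frac{1}{P}\, S(Pb) + \left(1 - \frac{1}{P}\right) S(0)
       = \frac{1}{P}\, S(Pb),
\]
so $S(Pb) \le P\, S(b)$.
Under these assumptions, one cooperative minibatch of $Pb$ seeds touches no more vertices in total than
$P$ independent minibatches of $b$ seeds each, and the work per seed vertex, $S(b)/b$, is nonincreasing in $b$.
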