\begin{abstract}
Training Graph Neural Networks (GNNs) at a large scale requires significant computational resources,
and the process is highly data-intensive.
One of the most effective ways to reduce resource requirements is minibatch training
coupled with graph sampling.
GNNs have the unique property that items in a minibatch have overlapping data.
However, the commonly implemented Independent Minibatching approach assigns each Processing
Element (PE) its own minibatch to process, leading to duplicated computations and redundant input data accesses across PEs.
This amplifies the Neighborhood Explosion Phenomenon (NEP), which is the main bottleneck limiting scaling.
To reduce the effects of NEP in the multi-PE setting,
we propose a new approach called Cooperative Minibatching.
Our approach capitalizes on the fact that the size of the sampled subgraph is a concave function of the batch size, leading to
significant reductions in the amount of work per seed vertex as batch sizes increase. Hence, it is favorable for
processors equipped with a fast interconnect to work on a large minibatch together as a single larger processor, instead of working on separate smaller
minibatches, even though the global batch size is identical.
We also show how to take advantage of the same phenomenon in serial execution by generating dependent consecutive minibatches.
Our experimental evaluations show up to $4\times$ bandwidth savings for fetching vertex embeddings, obtained simply by increasing
this dependency without harming model convergence. Combining our proposed approaches, we achieve up to 64\%
speedup over Independent Minibatching on single-node multi-GPU systems.
% UVC: why we had this?
%and show that load balancing is not an issue despite the use of lock-step communication.
\end{abstract}