\begin{abstract}
Siamese-network-based self-supervised learning (SSL) suffers from slow convergence
and instability in training.
To alleviate this, we propose the \textit{Ladder Siamese Network}, a framework that exploits intermediate self-supervision at each stage of a deep network.
Our self-supervised losses encourage the intermediate layers
to be consistent across different data augmentations of a single sample, which facilitates training progress and enhances the discriminative ability of the intermediate layers themselves.
While some existing work has already utilized multi-level self-supervision in SSL, ours differs in that 1) we reveal its usefulness with non-contrastive Siamese frameworks from both theoretical and empirical viewpoints, and 2) it improves image-level classification, instance-level detection, and pixel-level segmentation simultaneously.
Experiments show that the proposed framework improves BYOL baselines by 1.0\% points on ImageNet linear classification, 1.2\% points on COCO detection,
and 3.1\% points on PASCAL VOC segmentation.
Compared with state-of-the-art methods, our Ladder-based model achieves competitive and balanced performance across all tested benchmarks, without a large degradation on any one of them.
\end{abstract}