\begin{abstract}
    In object detection, multi-level prediction (e.g., FPN) and reweighting techniques (e.g., focal loss) have drastically improved one-stage detector performance. However, the synergy between these two techniques has not been fully explored in a unified framework. We find that, during training, the optimization of a one-stage detector is not only limited by the static hard-case mining loss (\emph{gradient drift}), but also suffers from the differing proportions of positive samples assigned to different pyramid levels (\emph{level discrepancy}). To address these issues, we propose Hierarchical Progressive Focus (HPF), which consists of two key designs: 1) \emph{progressive focus}, a more flexible hard-case mining setting computed adaptively according to the convergence progress, and 2) \emph{hierarchical sampling}, which automatically generates a set of progressive focus settings for level-specific target optimization. Built on focal loss with ATSS-R50, our approach achieves 40.5 AP, surpassing the state-of-the-art QFL (Quality Focal Loss, 39.9 AP) and VFL (Varifocal Loss, 40.1 AP). Our best model achieves \textbf{55.1} AP on COCO \emph{test-dev} using only a typical training setting. Moreover, as a plug-and-play scheme, HPF works well with recent advances, providing stable performance improvements on \textbf{9} mainstream detectors.
\end{abstract}