1: \begin{abstract}
2: The Lottery Ticket hypothesis proposes that ideal, sparse subnetworks, called lottery tickets, exist in the untrained dense network.
3: The Early Bird hypothesis proposes an efficient algorithm to find these winning lottery tickets in convolutional neural networks, using the novel concept of
4: distance between subnetworks to detect convergence in the subnetworks of a model.
5: However, this approach overlooks unchanging groups of unimportant neurons near the search's end. We propose WORM, a method that exploits
6: these static groups by truncating their gradients, forcing the model to rely on other neurons.
7: Experiments show WORM achieves faster ticket identification, training, and uses fewer FLOPs despite the additional computational overhead.
8: Additionally, WORM-pruned models lose less accuracy during pruning and recover accuracy faster, improving the robustness of the model.
9: Furthermore, WORM is also able to generalize the Early Bird hypothesis reasonably well to larger models, such as transformers,
10: displaying its flexibility to adapt to various architectures.
11: \end{abstract}
12: