d1e3818f1d58810c.tex
1: \begin{abstract}
2: The Lottery Ticket hypothesis proposes that ideal, sparse subnetworks, called lottery tickets, exist in the untrained dense network. 
3: The Early Bird hypothesis proposes an efficient algorithm to find these winning lottery tickets in convolutional neural networks, using the novel concept of 
4: distance between subnetworks to detect convergence in the subnetworks of a model. 
5: However, this approach overlooks unchanging groups of unimportant neurons near the search's end. We propose WORM, a method that exploits 
6: these static groups by truncating their gradients, forcing the model to rely on other neurons. 
7: Experiments show WORM achieves faster ticket identification, training, and uses fewer FLOPs despite the additional computational overhead. 
8: Additionally, WORM-pruned models lose less accuracy during pruning and recover accuracy faster, improving the robustness of the model.
9: Furthermore, WORM is also able to generalize the Early Bird hypothesis reasonably well to larger models, such as transformers, 
10: displaying its flexibility to adapt to various architectures.
11: \end{abstract}
12: