
\begin{abstract} % please do not insert language corrections without a TODO

Transfer learning is a popular technique for improving the initialization of neural networks. However, to our knowledge, no existing algorithm transfers parameters between networks with different architectures. In this study, we introduce inter-architecture knowledge transfer (IAT) and \algo{}, a computationally efficient method that transfers parameters between different architectures. Using dynamic programming, \algo{} automatically splits each architecture into abstract blocks and then matches the blocks and the layers within them. Once the matches are found, the parameters are transformed and transferred. Our primary goal is to speed up training compared with training neural networks from scratch, and we show that IAT outperforms existing parameter prediction and random initialization methods. After 4 epochs of training on CIFAR100, \algo{} yields on average 1.6 times higher validation accuracy than random initialization. Our method allows both human practitioners and neural architecture search (NAS) systems to modify trained networks and reuse their knowledge instead of retraining from scratch, which speeds up convergence. We also provide a new network architecture similarity measure that correlates with the effectiveness of \algo{}, allowing users to choose the source network without any training.

\end{abstract}