90f7adfff491d404.tex
1: \begin{abstract}
2:   Multitask learning is being increasingly adopted in applications domains like computer vision and reinforcement learning. However, {optimally exploiting its advantages} remains a major challenge due to the effect of negative transfer.
3:   %
4:   Previous works have tracked down this issue to the disparities in gradient magnitudes and directions across tasks when optimizing the shared network parameters. %
5:   %
6:   %
7: %
8: %
9:   While recent work has acknowledged that negative transfer is a two-fold problem, 
10:   existing approaches fall short. These methods only focus on either homogenizing the gradient magnitude across tasks;  or greedily change the gradient directions,  overlooking future conflicts.
11:   %
12:   In this work, we introduce \ours, an algorithm that tackles  negative transfer as a whole: 
13:   %
14: it jointly homogenizes gradient magnitudes and  directions,  while ensuring {training convergence}.
15:  %
16:   %
17:   %
18:   We show that \ours outperforms competing methods in complex problems, including multi-label classification in CelebA and computer vision tasks in the NYUv2 dataset.
19:   %
20:   A Pytorch implementation can be found in \url{https://github.com/adrianjav/rotograd}.
21:   %
22:   %
23:   %
24:   
25: %
26: \end{abstract}
27: