
1: \begin{abstract}

2:   Multitask learning is being increasingly adopted in application domains such as computer vision and reinforcement learning. However, {optimally exploiting its advantages} remains a major challenge due to the effect of negative transfer.

3:   %

4:   Previous works have traced this issue to disparities in gradient magnitudes and directions across tasks when optimizing the shared network parameters. %

5:   %

6:   %

7: %

8: %

9:   While recent work has acknowledged that negative transfer is a two-fold problem,

10:   existing approaches fall short. These methods either only homogenize gradient magnitudes across tasks or greedily change gradient directions, overlooking future conflicts.

11:   %

12:   In this work, we introduce \ours, an algorithm that tackles  negative transfer as a whole:

13:   %

14: it jointly homogenizes gradient magnitudes and  directions,  while ensuring {training convergence}.

15:  %

16:   %

17:   %

18:   We show that \ours outperforms competing methods on complex problems, including multi-label classification on CelebA and computer vision tasks on the NYUv2 dataset.

19:   %

20:   A PyTorch implementation is available at \url{https://github.com/adrianjav/rotograd}.

21:   %

22:   %

23:   %

24:

25: %

26: \end{abstract}

27: