1: \begin{abstract}
2: Multitask learning is increasingly adopted in application domains such as computer vision and reinforcement learning. However, {optimally exploiting its advantages} remains a major challenge due to the effect of negative transfer.
3: %
4: Previous work has traced this issue to disparities in gradient magnitudes and directions across tasks when optimizing the shared network parameters. %
5: %
6: %
7: %
8: %
9: While recent work has acknowledged that negative transfer is a two-fold problem,
10: existing approaches fall short: they either focus only on homogenizing gradient magnitudes across tasks, or greedily change the gradient directions, overlooking future conflicts.
11: %
12: In this work, we introduce \ours, an algorithm that tackles negative transfer as a whole:
13: %
14: it jointly homogenizes gradient magnitudes and directions, while ensuring {training convergence}.
15: %
16: %
17: %
18: We show that \ours outperforms competing methods on complex problems, including multi-label classification on CelebA and computer vision tasks on the NYUv2 dataset.
19: %
20: A PyTorch implementation is available at \url{https://github.com/adrianjav/rotograd}.
21: %
22: %
23: %
24:
25: %
26: \end{abstract}
27: