
\begin{abstract}
Recent advances in meta-learning demonstrate that deep representations combined with
gradient descent have sufficient capacity to approximate any learning algorithm.
A promising approach is model-agnostic meta-learning (MAML), which embeds gradient descent into the meta-learner.
It optimizes the initial parameters of the learner to warm-start the gradient descent updates,
so that new tasks can be solved using a small number of examples.
In this paper we extend gradient-based meta-learning by developing two new schemes.
First, we present a feedforward neural network, referred to as {\em T-net}, where the linear transformation
between two adjacent layers is decomposed as $\T \W$ such that $\W$ is learned by task-specific learners
and the transformation $\T$, which is shared across tasks, is meta-learned to speed up the convergence of gradient updates for task-specific learners.
Second, we present {\em MT-net}, where gradient updates in the T-net are guided by a meta-learned binary mask $\M$
that restricts the updates to a subspace.
Empirical results demonstrate that our method is less sensitive to the choice of initial learning rates than existing meta-learning methods,
and achieves state-of-the-art or comparable performance on few-shot classification and regression tasks.
\end{abstract}
