9e65b7c34ea3f237.tex
1: \begin{abstract}
2: % \noindent \def\thefootnote{*}\footnotetext{Equal Contribution.} \def\thefootnote{1}\footnotetext[1]{Gatsby Computational Unit, University College London, London.} \def\thefootnote{2}\footnotetext[2]{Department of Statistics, Columbia University, New York.} 
3: \noindent Many recent theoretical works on \emph{meta-learning} aim to provide guarantees for leveraging similar representational structures across related tasks in order to simplify a target task. A central goal of such guarantees is to establish the extent to which convergence rates, in learning a common representation, \emph{may scale with the number $N$ of tasks} (as well as the number of samples per task). 
4: Initial results in this setting establish such scaling when both the representation shared among tasks and the task-specific regression functions are linear. This linear setting readily reveals the benefits of aggregating tasks, e.g., via averaging arguments. In practice, however, the representation is often highly nonlinear,
5: introducing nontrivial biases in each task that cannot easily be averaged out as in the linear case.\\ 
6: In the present work, we derive theoretical guarantees for meta-learning with nonlinear representations. In particular, assuming the shared nonlinearity maps to an infinite-dimensional reproducing kernel Hilbert space, we show that additional biases can be mitigated with careful regularization that leverages the smoothness of task-specific regression functions, yielding improved rates that scale with the number of tasks as desired.      
7: \end{abstract}
8: