a8da6c81eb63716f.tex
1: \begin{abstract}	
2: 	We consider the problem of knowledge transfer when an agent is facing a series of Reinforcement Learning (RL) tasks.
3: 	We introduce a novel metric between Markov Decision Processes and establish that close MDPs have close optimal value functions.
4: 	Formally, the optimal value functions are Lipschitz continuous with respect to the task space.
5: 	These theoretical results lead us to a value-transfer method for \lrl{}, which we use to build a PAC-MDP algorithm with an improved convergence rate.
6: 	Further, we show that the method experiences no negative transfer with high probability.
7: 	We illustrate the benefits of the method in \lrl{} experiments.
8: \end{abstract}
9: