abstract:a8da6c81eb63716f.tex

1: \begin{abstract}

2: 	We consider the problem of knowledge transfer when an agent faces a series of Reinforcement Learning (RL) tasks.

3: 	We introduce a novel metric between Markov Decision Processes (MDPs) and establish that MDPs that are close under this metric have close optimal value functions.

4: 	Formally, the optimal value functions are Lipschitz continuous with respect to the task space.

5: 	These theoretical results lead us to a value-transfer method for \lrl{}, which we use to build a PAC-MDP algorithm with an improved convergence rate.

6: 	Further, we show that, with high probability, the method experiences no negative transfer.

7: 	We illustrate the benefits of the method in \lrl{} experiments.

8: \end{abstract}

9: