\begin{abstract}
	% The challenge
	One major limitation to the applicability of Reinforcement Learning (RL)
	to many practical domains is the large number of samples
	required to learn an optimal policy.
	% The setting
	To address this problem and improve learning efficiency,
	we consider a linear hierarchy of abstraction layers
	of the Markov Decision Process (MDP) underlying the target domain.
	Each layer is an MDP representing a coarser model of the one immediately below in the hierarchy.
	% The solution idea
	In this work, we propose a novel form of Reward Shaping in which
	the solution obtained at the abstract level supplies
	rewards to the more concrete MDP, so that
	the abstract solution guides learning in the more complex domain.
	% Good properties
	In contrast with other works in Hierarchical RL, our technique imposes few requirements on the
	design of the abstract models and tolerates modeling errors, which makes the
	proposed approach practical.
	% Supporting our claims
	We formally analyze the relationship between the abstract models and
	the exploration heuristic induced in the lower-level domain.
	Moreover, we prove that the method guarantees convergence to an optimal policy,
	and we demonstrate its effectiveness experimentally.
\end{abstract}