\begin{abstract}
	% The challenge
	One major limitation to the applicability of Reinforcement Learning (RL)
	to many practical domains is the large number of samples
	required to learn an optimal policy.
	% The setting
	To address this problem and improve learning efficiency,
	we consider a linear hierarchy of abstraction layers
	of the Markov Decision Process (MDP) underlying the target domain.
	Each layer is an MDP representing a coarser model of the one immediately below in the hierarchy.
	% The solution idea
	In this work, we propose a novel form of Reward Shaping where
	the solution obtained at the abstract level is used to offer
	rewards to the more concrete MDP, in such a way that
	the abstract solution guides the learning in the more complex domain.
	% Good properties
	In contrast with other works in Hierarchical RL, our technique imposes few requirements on the
	design of the abstract models and is tolerant to modeling errors, which makes the
	proposed approach practical.
	% Supporting our claims
	We formally analyze the relationship between the abstract models and
	the exploration heuristic induced in the lower-level domain.
	Moreover, we prove that the method guarantees convergence to an optimal policy,
	and we demonstrate its effectiveness experimentally.
\end{abstract}