\begin{abstract}
% The challenge
One major limitation to the applicability of Reinforcement Learning (RL)
to many practical domains is the large number of samples
required to learn an optimal policy.
% The setting
To address this problem and improve learning efficiency,
we consider a linear hierarchy of abstraction layers
of the Markov Decision Process (MDP) underlying the target domain.
Each layer is an MDP representing a coarser model of the one immediately below it in the hierarchy.
% The solution idea
In this work, we propose a novel form of Reward Shaping in which
the solution obtained at the abstract level supplies
rewards to the more concrete MDP, so that
the abstract solution guides learning in the more complex domain.
% Good properties
In contrast with other works in Hierarchical RL, our technique imposes few requirements on the
design of the abstract models and is tolerant of modeling errors, which makes the
proposed approach practical.
% Supporting our claims
We formally analyze the relationship between the abstract models and
the exploration heuristic induced in the lower-level domain.
Moreover, we prove that the method guarantees convergence to an optimal policy,
and we demonstrate its effectiveness experimentally.
\end{abstract}