
\begin{abstract}

  For complex, high-dimensional Markov Decision Processes (MDPs), it may be necessary to represent the policy with function approximation. A problem is misspecified whenever the representation cannot express any policy with acceptable performance. We introduce \Alg, an approach for solving misspecified problems. \Alg\ iteratively learns a set of context-specialized options and combines these options to solve an otherwise misspecified problem. Our main contribution is a proof that \Alg\ enjoys theoretical convergence guarantees. In addition, we extend \Alg\ to exploit Option Interruption (OI), enabling it to decide where the learned options can be reused. Our experiments demonstrate that \Alg\ can find near-optimal solutions to otherwise misspecified problems and that OI can further improve the solutions.

\end{abstract}
