\begin{abstract}
    We consider a reinforcement learning framework where agents have to navigate from start states to goal states.
    We prove convergence of a cycle-detection learning algorithm on a class of tasks that we call reducible.
    Reducible tasks have an acyclic solution.
    We also syntactically characterize the form of the final policy. This characterization can be used to precisely detect the convergence point in a simulation.
    Our result demonstrates that even simple algorithms can be successful in learning a large class of nontrivial tasks.
    In addition, our framework is elementary in the sense that we only use basic concepts to formally prove convergence.
    %\keywords{reinforcement learning \and graph \and navigation \and convergence}
\end{abstract}
