1: \begin{abstract}
2: We consider a reinforcement learning framework where agents have to navigate from start states to goal states.
3: We prove convergence of a cycle-detection learning algorithm on a class of tasks that we call reducible.
4: Reducible tasks have an acyclic solution.
5: We also syntactically characterize the form of the final policy. This characterization can be used to precisely detect the convergence point in a simulation.
6: Our result demonstrates that even simple algorithms can be successful in learning a large class of nontrivial tasks.
7: In addition, our framework is elementary in the sense that we only use basic concepts to formally prove convergence.
8: %\keywords{reinforcement learning \and graph \and navigation \and convergence}
9: \end{abstract}
10: