\begin{abstract}
We consider the Monte-Carlo first-visit algorithm, whose goal is to find the optimal control in
a Markov decision process with a finite state space and a finite number of possible actions.
We prove that the algorithm converges when the discount factor is smaller than $1/2$.
\end{abstract}