$, for ensuring the convergence of Q-learning, the learning rate $