\begin{abstract}
We study the convergence of $q$-learning and related algorithms introduced by
Jia and Zhou (J. Mach. Learn. Res., 24 (2023), 161) for controlled diffusion processes.
Under suitable conditions on the growth and regularity of the model parameters,
we provide a quantitative error and regret analysis of both
the exploratory policy improvement algorithm
and the $q$-learning algorithm.
\end{abstract}
