\begin{abstract}
We study the convergence of $q$-learning and related algorithms introduced by
Jia and Zhou (J. Mach. Learn. Res., 24 (2023), 161) for controlled diffusion processes.
Under suitable conditions on the growth and regularity of the model parameters,
we provide a quantitative error and regret analysis of both
the exploratory policy improvement algorithm
and the $q$-learning algorithm.
\end{abstract}
