\begin{abstract}

We analyse the regret arising from learning the price sensitivity parameter $\kappa$ of liquidity takers in the ergodic version of the Avellaneda--Stoikov market making model.

We show that a learning algorithm based on a regularised maximum-likelihood estimator for the parameter achieves a regret upper bound of order $\ln^2 T$ in expectation.

The result rests on two key ingredients.

The first consists of tight upper bounds on the derivative of the ergodic constant in the Hamilton--Jacobi--Bellman (HJB) equation with respect to $\kappa$.

The second is the learning rate of the maximum-likelihood estimator, which is obtained from concentration inequalities for Bernoulli signals.

A numerical experiment confirms the convergence and robustness of the proposed algorithm.

\end{abstract}
