\begin{abstract}
This paper studies an infinite-horizon optimal control problem for
discrete-time linear systems with quadratic criteria, where both have random
parameters that are independent and identically distributed over
time. A classical approach is to solve an algebraic Riccati
equation that involves mathematical expectations and requires certain
statistical information about the parameters. In this paper, we propose
an online iterative algorithm in the spirit of Q-learning for the
situation where only one random sample of the parameters is observed at each time step.
The first theorem proves the equivalence
of three properties: the convergence of the learning sequence, the
well-posedness of the control problem, and the solvability of the
algebraic Riccati equation. The second theorem shows that the adaptive
feedback control defined in terms of the learning sequence stabilizes the
system as long as the control problem is well-posed. Numerical examples
are presented to illustrate our results.
\end{abstract}