\begin{abstract}
This paper presents a novel value iteration (VI) algorithm for finding the optimal control of a class of infinite-horizon stochastic linear quadratic (SLQ) problems with unknown system dynamics. First, an off-line algorithm is established to obtain the optimal feedback control. Then, building on the off-line algorithm, a VI-based model-free algorithm is developed and its convergence is proved. The main feature of the model-free algorithm is that it does not require a stabilizing control for initialization. Finally, we validate our results with a simulation example.
\end{abstract}