\begin{abstract}
  This paper proposes a novel model-free algorithm that learns a near-optimal
  control law for constrained-input systems from online data, without requiring
  any \emph{a priori} knowledge of the system dynamics. Following the generalized
  policy iteration framework, two neural networks (NNs), a critic NN and an actor
  NN, approximate the optimal value function and the optimal policy, respectively.
  The stability of the closed-loop system and the convergence of the NN weights
  are guaranteed by Lyapunov analysis.
\end{abstract}