\begin{abstract}
We consider the state estimation problem for nonlinear discrete-time stochastic systems. We combine Lyapunov's method from control theory with deep reinforcement learning to design the state estimator. We prove theoretically that the estimate error converges and remains bounded, using only data simulated from the model. An actor-critic reinforcement learning algorithm is proposed to learn the state estimator, which is approximated by a deep neural network, and the convergence of the algorithm is analysed. The proposed Lyapunov-based reinforcement learning state estimator is compared with a number of existing nonlinear filtering methods through Monte Carlo simulations, which show its advantage in terms of estimate convergence, even under system uncertainties such as covariance shift in the system noise and randomly missing measurements. To the best of our knowledge, this is the first reinforcement learning-based nonlinear state estimator with a guarantee of bounded estimate error.
\end{abstract}