
\begin{abstract}
    This article introduces a novel framework for data-driven linear quadratic regulator (LQR) design.
    First, we present a reinforcement learning paradigm for on-policy data-driven LQR, in which exploration and exploitation are performed simultaneously while guaranteeing robust stability of the whole closed-loop system, encompassing the plant and the control/learning dynamics.
    Then, we propose Model Reference Adaptive Reinforcement Learning (MR-ARL), a control architecture integrating tools from reinforcement learning and model reference adaptive control.
    The approach relies on a variable reference model containing the currently identified value function.
    An adaptive stabilizer then ensures convergence of the applied policy to the optimal one, convergence of the plant to the optimal reference model, and overall robust closed-loop stability.
    The proposed framework provides theoretical robustness certificates against real-world perturbations such as measurement noise, plant nonlinearities, or slowly varying parameters.
    The effectiveness of the proposed architecture is validated via realistic numerical simulations.
\end{abstract}
