\begin{abstract}
The stable combination of optimal feedback policies with online learning
is studied in a new control-theoretic framework for uncertain nonlinear systems.
The framework can be systematically used in transfer learning and sim-to-real applications, where an optimal policy learned for a nominal system needs to remain effective in the presence of significant variations in parameters.
Given unknown parameters within a bounded range, the resulting adaptive control laws guarantee convergence of the closed-loop system to the state of zero cost.
Online adjustment of the learning rate is used as a key stability mechanism, and preserves certainty equivalence when designing optimal policies without assuming uncertainty to be within the control range.
The approach is illustrated on the familiar mountain car problem, where it yields near-optimal performance despite the presence of parametric model uncertainty.
\end{abstract}
