
\begin{abstract}

In many high-stakes, safety-critical real-world scenarios, a human decision-maker (HDM) receives recommendations from an artificial intelligence while retaining ultimate responsibility for the decisions. %This protocol involving both an autonomous agent and an HDM is known as ``expert in the loop." %where an HDM receives recommendations from an algorithm but ultimately decides which actions need to be taken.

In this letter, we develop an ``adherence-aware Q-learning'' algorithm for this setting. The algorithm learns the ``adherence level,'' which captures how frequently an HDM follows the recommended actions, and derives the best recommendation policy in real time. We prove that the proposed Q-learning algorithm converges to the optimal value and evaluate its performance across various scenarios.

\end{abstract}
