\begin{abstract}

This paper focuses on the problem of detecting and reacting to changes in the distribution of a sensorimotor controller's observables.
The key idea is the design of switching policies that take conformal quantiles as input, an approach we call
\emph{conformal policy learning}, which allows robots to detect distribution shifts with formal statistical guarantees.
%
We show how to design such policies either by using conformal quantiles to switch between base policies with different characteristics, e.g., safety or speed, or by directly augmenting a policy's observation with a quantile and training it with reinforcement learning.
% (the latter does not work as well as the former in our experiments, despite its conceptual appeal)
%Despite the latter being novel and an interesting form of learning under uncertainty, its practical performance is not as good as that of the switching policy.
%
Theoretically, we show that such policies achieve formal convergence guarantees in finite time.
In addition, we thoroughly evaluate their advantages and limitations on two compelling use cases: simulated autonomous driving and active perception with a physical quadruped.
%
Empirical results demonstrate that our approach outperforms five baselines.
%
It is also the simplest of the compared strategies, with the exception of one ablation.
%
Easy to use, flexible, and equipped with formal guarantees, our approach demonstrates how conformal prediction can be an effective tool for sensorimotor learning under uncertainty.

\end{abstract}
