\begin{abstract}
To align mobile robot navigation policies with user preferences through reinforcement learning from human feedback (RLHF), reliable and behavior-diverse user queries are required.
However, deterministic policies fail to generate a variety of navigation trajectory suggestions for a given navigation task.
We introduce EnQuery, a query generation approach using an ensemble of policies that achieve behavioral diversity through a regularization term.
For a given navigation task, EnQuery produces multiple navigation trajectory suggestions, making preference data collection more efficient by requiring fewer queries.
Our approach demonstrates superior performance in aligning navigation policies with user preferences in low-query regimes, offering improved policy convergence from sparse preference queries.
The evaluation is complemented by a novel explainability representation that captures the full-scene navigation behavior of the mobile robot in a single plot.
\end{abstract}