8cd1c8385b1b8770.tex
1: \begin{abstract}
2: Methods based on reinforcement learning (RL) have been increasingly explored for robot learning. However, RL-based methods often suffer
3: from low sample efficiency in the exploration phase, especially for
4: long-horizon manipulation tasks, and generally neglect the semantic
5: information available at the task level, resulting in delayed convergence
6: or even task failure. To tackle these challenges, we propose a Temporal-Logic-guided Hybrid
7: policy framework (HyTL), which uses three levels of decision-making to improve the agent's performance. Specifically, task specifications are encoded in linear temporal logic (LTL) to improve performance and provide interpretability. A waypoint planning module, driven by feedback from
8: the LTL-encoded task level, serves as the high-level policy and improves exploration efficiency. The middle-level policy selects which behavior primitives
9: to execute, and the low-level policy specifies the corresponding parameters to interact with
10: the environment. We evaluate HyTL on four challenging manipulation
11: tasks, and the results demonstrate its effectiveness and interpretability.
12: Our project is available at: \href{https://sites.google.com/view/hytl-0257/}{https://sites.google.com/view/hytl-0257/}.
13: 
14: \global\long\def\prog{\operatorname{prog}}
15: \global\long\def\argmax{\operatorname{argmax}}
16: \global\long\def\argmin{\operatorname{argmin}}
17: \end{abstract}
18: