abstract:3f19fac6c688a004.tex

1: \begin{abstract}

2: Generative adversarial imitation learning (GAIL) has attracted increasing attention in the field of robot learning.

3: It enables robots to learn a policy that achieves a task demonstrated by an expert while simultaneously estimating the reward function underlying the expert's behavior.

4: However, this framework is limited to learning a single task with a single reward function.

5: This study proposes an extended framework called situated GAIL (S-GAIL), in which a task variable is introduced to both the discriminator and generator of the GAIL framework.

6: The task variable serves to distinguish different contexts and enables the framework to learn different reward functions and policies for multiple tasks.

7: To achieve early convergence of learning and robust reward estimation, we introduce a term that adjusts the entropy regularization coefficient in the generator's objective function.

8: Our experiments using two setups (navigation in a discrete grid world and arm reaching in a continuous space) demonstrate that the proposed framework can acquire multiple reward functions and policies more effectively than existing frameworks.

9: The task variable enables our framework to differentiate contexts while sharing common knowledge among multiple tasks.

10: %The code for reproducing the experiments is available at https://github.com/kyoichiro/SituatedGAIL.

11:

12: \begin{keywords}

13:   imitation learning; generative adversarial imitation learning; inverse reinforcement learning; reinforcement learning; reward function

14: \end{keywords}\medskip

15:

16: \end{abstract}

17: