
\begin{abstract}

  \begin{quote}

  	Deep reinforcement learning (deep RL) has achieved superior performance in complex sequential tasks by learning directly from image input. A deep neural network is used as a function approximator and requires no specific state information. However, one drawback of using only images as input is that the model must learn the state feature representation itself, which takes a prohibitively large amount of training time and data to reach reasonable performance. This is not feasible in real-world applications, especially when data is expensive and the training phase could introduce disasters that affect human safety. In this work, we use human demonstrations to speed up feature learning. We then use the pre-trained model to replace the neural network in the Deep Q-Network (DQN) and apply human interaction to further refine the model. We empirically evaluate our approach on the Microsoft AirSim car simulator, using both the human demonstration model alone and a modified DQN that includes the human demonstration model. Our results show that: (1) pre-training with human demonstration in a supervised learning manner is better at discovering features and much faster than DQN alone, (2) initializing the DQN with a pre-trained model provides a significant improvement in training time and performance even with limited human demonstration, and (3) allowing humans to supply suggestions during DQN training can speed up the network's convergence to an optimal policy and help it learn more complex policies that are harder to discover by random exploration.

  \end{quote}

\end{abstract}
