\begin{abstract}

    One way to approach end-to-end autonomous driving is to learn a policy
    function that maps a sensory input, such as an image frame from a
    front-facing camera, to a driving action by imitating an expert driver, or
    a reference policy. This can be done by supervised learning, where a policy
    function is tuned to minimize the difference between the predicted and
    ground-truth actions. A policy function trained in this way, however, is
    known to suffer from unexpected behaviours due to the mismatch between the
    states reachable by the reference policy and those reachable by the trained
    policy. More advanced algorithms for imitation learning, such as DAgger,
    address this issue by iteratively collecting training examples from both
    reference and trained policies. These algorithms often require a large
    number of queries to a reference policy, which is undesirable as the
    reference policy is often expensive to query. In this paper, we propose an
    extension of DAgger, called SafeDAgger, that is query-efficient and more
    suitable for end-to-end autonomous driving. We evaluate the proposed
    SafeDAgger in a car racing simulator and show that it indeed requires fewer
    queries to a reference policy. We observe a significant speed-up in
    convergence, which we conjecture to be due to the effect of automated
    curriculum learning.

\end{abstract}