\begin{abstract}
Work on summarization has explored both reinforcement learning (RL)
optimization using ROUGE as a reward and syntax-aware models,
such as models whose input is enriched with
part-of-speech (POS) tags and dependency information. However, the
respective impact of these approaches beyond the standard ROUGE evaluation
metric remains unclear. This question is especially pressing as RL-based
summarization is becoming increasingly popular. In
this paper, we provide a detailed comparison of these two
approaches and of their combination along several dimensions that
relate to the perceived quality of the generated summaries: number of
repeated words, distribution of part-of-speech tags, impact of sentence
length, relevance, and grammaticality.
%how many words are repeated in the output~? How close to the ground truth
%is the generated distribution of part-of-speech tags~? What is the impact
%of sentence length~? How good are relevance and grammaticality~?
Using the standard Gigaword sentence summarization task,
we compare an RL self-critical sequence training (SCST)
method with syntax-aware models that leverage POS tags and dependency information.
We show that the combined model gives the best results on all
qualitative evaluations, but also that training with RL alone, without
any syntactic information, already gives results nearly as good as those of
syntax-aware models, with fewer parameters and faster training convergence.
\end{abstract}