\begin{abstract}
Connectionist temporal classification (CTC) enables end-to-end sequence learning by maximizing the
probability of correctly recognizing sequences during training. The outputs of a CTC-trained model tend
to form a series of spikes separated by strongly predicted blanks, known as the spiky problem. To find
its cause, we reinterpret the CTC training process as an iterative fitting task based on a
frame-wise cross-entropy loss. This reinterpretation offers an intuitive way to compare target probabilities
with model outputs at each iteration and explains how the model outputs gradually turn spiky. Building on it, we
propose two ways to modify CTC training. Experiments demonstrate that our method
effectively solves the spiky problem and, moreover, leads to faster convergence across various training settings.
In addition, the reinterpretation of CTC, as a new perspective, may be useful in other
situations. The code is publicly available at https://github.com/hzli-ucas/caffe/tree/ctc.
\end{abstract}
