abstract:e947a4f5442c18b0.tex

1: \begin{abstract}

2: Recent studies have shown that cross-lingual language model pretraining \cite{Lample2019CrosslingualLM}, especially Translation Language Modeling (TLM), noticeably improves the performance of neural machine translation.

3: To alleviate TLM's need for expensive

4: parallel corpora, in this work we incorporate translation information from dictionaries into the pretraining process and propose a novel Bilingual Dictionary-based Language Model (BDLM).

5: % We evaluate our BDLM on WMT-News'19 Zh-En \cite{tiedemann2012parallel}, WMT'20 news-commentary Zh-En, and WMT'16 Ro-En.

6: We evaluate our BDLM on Chinese, English, and Romanian.

7: % Pretrained on these combined corpora, the BDLM is then fine-tuned for NMT.

8: For Chinese-English, we obtain 55.0 BLEU on WMT-News'19 \cite{tiedemann2012parallel} and 24.3 BLEU on WMT'20 news-commentary, outperforming the vanilla Transformer \cite{vaswani2017attention} by more than 8.4 and 2.3 BLEU, respectively.

9: According to our results, the BDLM also has advantages in convergence speed and in predicting rare words.

10: The BLEU improvement on WMT'16 Romanian-English also shows its effectiveness in low-resource language translation.

11: % Our code and pretrained models will be publicly available.

12: \end{abstract}

13: