\begin{abstract}
Training deep neural networks is a challenging task. To speed up training and
enhance the performance of deep neural networks, we modify the vanilla conjugate
gradient into a conjugate-gradient-like form and incorporate it into generic Adam, thereby
proposing a new optimization algorithm for deep learning named CG-like-Adam.
Specifically, both the first-order and the second-order moment estimates of generic Adam
are replaced by their conjugate-gradient-like counterparts. The convergence analysis covers the cases
in which the exponential moving average coefficient of the
first-order moment estimate is constant and the first-order moment estimate is unbiased.
Numerical experiments on the CIFAR10/100 datasets show the superiority of
the proposed algorithm.
\end{abstract}