1: \begin{abstract}
2: % Recurrent Neural Network~(RNN) has made great progress in various fields recently.
3: % However,
4: Training a Recurrent Neural Network~(RNN) with stable convergence while avoiding \textit{vanishing} and \textit{exploding} gradients is hard, because the weights in the recurrent unit are reused from iteration to iteration.
5: Moreover, RNNs are sensitive to the initialization of weights and biases, which makes the training phase more difficult.
6: Because it is gradient-free and immune to poor conditions, the Alternating Direction Method of Multipliers~(ADMM) has become a promising alternative to traditional stochastic gradient algorithms for training neural networks.
7: However, ADMM cannot be applied to train RNNs directly, since the state in the recurrent unit is repeatedly updated over timesteps.
8: Therefore, this work builds a new framework named ADMMiRNN upon the unfolded form of the RNN to address the above challenges simultaneously, and provides novel update rules together with a theoretical convergence analysis.
9: Instead of relying on vanilla ADMM, we explicitly specify the key update rules in the iterations of ADMMiRNN, using deliberately constructed approximation techniques and solutions to each sub-problem.
10: Numerical experiments are conducted on MNIST and text classification tasks, where ADMMiRNN achieves convergent results and outperforms the compared baselines.
11: Furthermore, compared to stochastic gradient algorithms, ADMMiRNN trains RNNs in a more stable way, without vanishing or exploding gradients.
12: % than them, %
13: Source code is available at \href{https://github.com/TonyTangYu/ADMMiRNN}{https://github.com/TonyTangYu/ADMMiRNN}.
14:
15:
16: %\keywords{RNN \and ADMM\and Convergence\and Stability.}
17:
18: \end{abstract}
19: