\begin{abstract}
The novel \emph{Unbiased Online Recurrent Optimization}
(UORO) algorithm allows for online learning of general recurrent
computational graphs such as recurrent network models. It works in a
streaming fashion and avoids backtracking through past activations and
inputs. UORO is computationally as costly as \emph{Truncated
Backpropagation Through Time} (truncated BPTT), a widespread algorithm for online
learning of recurrent networks \cite{jaeger2002tutorial}.
UORO is a modification of \emph{NoBackTrack}
\cite{DBLP:journals/corr/OllivierC15} that bypasses the need for model
sparsity and makes implementation easy in current deep learning
frameworks, even for complex models.

Like NoBackTrack,
UORO provides unbiased gradient estimates; unbiasedness is the
core hypothesis in stochastic gradient descent theory, without which
convergence to a local optimum is not guaranteed. In contrast,
truncated BPTT does not provide this property, which can lead
to divergence.

On synthetic tasks where truncated BPTT is shown to diverge, UORO converges. For
instance, when a parameter has a positive short-term but negative long-term
influence, truncated BPTT diverges unless the truncation span is
significantly longer than the intrinsic temporal range of the
interactions, while UORO performs well thanks to the unbiasedness of its
gradients.
\end{abstract}
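
% Illustrative toy example (an assumption for exposition, not one of the
% synthetic tasks referred to in the abstract).
The effect of truncation bias described above can be seen in a minimal
sketch. The delay $K$, the weights $-1$ and $+2$, and the symbols
$\ell_t$, $\theta_s$, $T$ and $g^{(T)}$ are hypothetical and chosen purely for
illustration. Suppose the loss at time $t$ depends on the value $\theta_t$ of a
scalar parameter used at time $t$ and on the value $\theta_{t-K}$ used $K$
steps earlier:
\[
  \ell_t = -\,\theta_t + 2\,\theta_{t-K},
  \qquad \theta_s = \theta \ \text{ for all } s .
\]
The full gradient accounts for both dependencies,
\[
  \frac{\mathrm{d}\ell_t}{\mathrm{d}\theta} = -1 + 2 = +1 ,
\]
so gradient descent should decrease $\theta$. A truncated gradient that only
retains dependencies of lag at most $T$ instead gives
\[
  g^{(T)} =
  \left\{
  \begin{array}{ll}
    -1 & \mbox{if } T < K, \\
    +1 & \mbox{if } T \geq K .
  \end{array}
  \right.
\]
For $T < K$ the truncated gradient has the wrong sign at every step, so the
error is a systematic bias rather than noise that averages out. An unbiased
estimator may be noisy at each step, but its expectation equals the full
gradient $+1$.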