1: \begin{abstract}
2: Large-scale tensor learning algorithms are usually computationally expensive and require storing an exceedingly large amount of data. In this paper, we propose a unified online Riemannian gradient descent (oRGrad) algorithm for tensor learning, which is computationally fast, consumes much less memory, and can handle sequentially arriving data and make timely predictions. The algorithm is applicable to both linear and generalized linear models. We observe an intriguing \textit{trade-off} between the computational convergence rate and the statistical rate of oRGrad: increasing the step size accelerates computational convergence but leads to a larger statistical error, whereas a smaller step size yields a smaller statistical error at the expense of slower computational convergence. If the time horizon $T$ is known, oRGrad achieves statistical optimality by choosing an appropriate fixed step size. Besides the aforementioned benefits, we discover that noisy tensor completion is particularly blessed by online algorithms: the online approach avoids the trimming procedure and guarantees sharp \textit{entry-wise} statistical error, which is usually deemed technically challenging for offline methods. Online algorithms also render powerful martingale tools applicable for theoretical analysis. We investigate the regret of oRGrad and observe a fascinating \textit{trilemma} concerning the computational convergence rate, statistical error, and regret. By choosing an appropriate constant step size, oRGrad achieves an $O(T^{1/2})$ regret. We then introduce the \textit{adaptive}-oRGrad algorithm, which achieves the optimal $O(\log T)$ regret by adaptively selecting the step sizes, regardless of whether the time horizon is known. The adaptive-oRGrad algorithm also attains the statistically optimal error without knowing the horizon. Comprehensive numerical simulation results corroborate our theoretical findings.
3: \end{abstract}
4: