\begin{abstract}
\begin{quote}
  We offer new insights into the two commonly used updates for
    the online $k$-PCA problem, namely Krasulina's and
    Oja's updates. We show that Krasulina's update
    corresponds to a projected gradient descent step on the
    Stiefel manifold of orthonormal $k$-frames,
    while Oja's update amounts to a gradient descent step using the unprojected gradient.
    Following these observations, we derive a more
    \emph{implicit} form of Krasulina's $k$-PCA
    update, i.e., a version that uses information about the
    future gradient as much as possible. Most
    interestingly, our implicit Krasulina
    update avoids the costly QR-decomposition step
    by bypassing the orthonormality constraint. We show
    that the new update in fact corresponds to an online EM
    step applied to a probabilistic $k$-PCA model. The probabilistic view of the
    updates allows us to combine multiple models in a
    distributed setting. We show experimentally
    that the implicit Krasulina update yields
    superior convergence while being significantly faster.
    We also give strong evidence that the new update
    can benefit from parallelism and is more stable
    with respect to tuning of the learning rate.
\end{quote}
\end{abstract}