\begin{abstract}
\begin{quote}
We shed new light on two commonly used updates for
the online $k$-PCA problem, namely Krasulina's and
Oja's updates. We show that Krasulina's update
corresponds to a projected gradient descent step on the
Stiefel manifold of orthonormal $k$-frames,
while Oja's update amounts to a gradient descent step using the unprojected gradient.
Following these observations, we derive a more
\emph{implicit} form of Krasulina's $k$-PCA
update, i.e., a version that uses the information of the
future gradient as much as possible. Most
interestingly, our implicit Krasulina
update avoids the costly QR-decomposition step
by bypassing the orthonormality constraint. We show
that the new update in fact corresponds to an online EM
step applied to a probabilistic $k$-PCA model. The probabilistic view of the
updates allows us to combine multiple models in a
distributed setting. We show experimentally
that the implicit Krasulina update yields
superior convergence while being significantly faster.
We also give strong evidence that the new update
can benefit from parallelism and is more stable
with respect to tuning of the learning rate.
\end{quote}
\end{abstract}
