\begin{abstract}
The learning dynamics of on-line independent component analysis is
analysed in the limit of large data dimension. We study a simple
Hebbian learning algorithm that can be used to separate a
small number of non-Gaussian components from a high-dimensional
data set. The de-mixing matrix parameters are confined to a
Stiefel manifold of tall, orthogonal matrices, and we introduce a
natural gradient variant of the algorithm which is appropriate to
learning on this manifold. For large input dimension, the parameter
trajectory of both algorithms passes through a sequence of
unstable fixed points, each described by a diffusion process in a
polynomial potential. Choosing too large a learning rate
increases the escape time from each of these fixed points,
effectively trapping the learning in a sub-optimal state. To
avoid these trapping states, a very low learning rate must
be chosen during the learning transient, resulting in learning
time-scales of $O(N^2)$ or $O(N^3)$ iterations, where $N$ is the
data dimension. Escape from each sub-optimal state results in a
sequence of symmetry-breaking events as the algorithm learns each
source in turn. This is in marked contrast to the learning
dynamics displayed by related on-line learning algorithms for
multilayer neural networks and principal component analysis.
Although the natural gradient variant of the algorithm has good
asymptotic convergence properties, its transient dynamics is
equivalent to that of the standard Hebbian algorithm.
\end{abstract}