\begin{abstract}
We propose a novel online learning paradigm
for nonlinear-function estimation tasks
based on iterative projections in the $L^2$ space
with a probability measure reflecting the stochastic properties of the input signals.
The proposed learning algorithm exploits the reproducing kernel of the
so-called dictionary subspace,
based on the fact that any finite-dimensional space of functions has a
reproducing kernel characterized by the Gram matrix.
The $L^2$-space geometry provides the best decorrelation property in principle.
The proposed learning paradigm differs
significantly
from the conventional kernel-based learning paradigm
in two respects: (i) the whole space is {\em not} a reproducing
kernel Hilbert space, and (ii) the minimum mean squared error estimator
gives the best approximation of the desired nonlinear function in the
dictionary subspace.
The proposed paradigm preserves efficiency in computing the inner product as well as
in updating the Gram matrix when the dictionary grows.
Monotone approximation,
asymptotic optimality, and convergence of the proposed algorithm are analyzed
based on the variable-metric version of the adaptive projected subgradient method.
Numerical examples demonstrate the efficacy of the proposed algorithm on real data
in comparison with a variety of methods, including
the extended Kalman filter and many batch machine-learning methods such as the multilayer perceptron.
\end{abstract}
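
As a minimal sketch of the Gram-matrix characterization mentioned in the abstract, assume that the dictionary subspace is spanned by linearly independent functions $\psi_1,\dots,\psi_r$ in the $L^2$ space with probability measure $\mu$. The symbols $\psi_i$, $r$, $\mu$, $G$, and $\kappa$ are notation introduced here for illustration only. Under this assumption the Gram matrix $G$ is positive definite, and one standard way to write the reproducing kernel $\kappa$ of the span is
\[
  G_{ij} = \langle \psi_i, \psi_j \rangle = \int \psi_i(x)\,\psi_j(x)\,d\mu(x),
  \qquad
  \kappa(x,y) = \sum_{i=1}^{r}\sum_{j=1}^{r} \bigl[G^{-1}\bigr]_{ij}\,\psi_i(x)\,\psi_j(y).
\]
For any $f = \sum_{k=1}^{r} c_k \psi_k$ in the span, the reproducing property can be checked directly:
\[
  \langle f, \kappa(\cdot,y) \rangle
  = \sum_{k=1}^{r} c_k \sum_{j=1}^{r} \Bigl( \sum_{i=1}^{r} G_{ki}\,\bigl[G^{-1}\bigr]_{ij} \Bigr) \psi_j(y)
  = \sum_{k=1}^{r} c_k\,\psi_k(y)
  = f(y).
\]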