\begin{abstract}
We describe the Median $K$-flats (MKF) algorithm, a simple online
method for hybrid linear modeling, i.e., for approximating data by a
mixture of flats. The algorithm simultaneously partitions the data
into clusters and finds their corresponding best approximating
$\ell_1$ $d$-flats, so that the cumulative $\ell_1$ error is
minimized. The current implementation restricts $d$-flats to be
$d$-dimensional linear subspaces. It requires a negligible amount of
storage, and its complexity, when modeling data consisting of $N$
points in $\reals^D$ by $K$ $d$-dimensional linear subspaces, is
$O(n_s \cdot K \cdot d \cdot D + n_s \cdot d^2 \cdot D)$,
where $n_s$ is the number of iterations required for convergence
(empirically on the order of $10^4$). Because the algorithm is
online, data can be supplied to it incrementally, and it can
produce the corresponding output incrementally. We evaluate the
performance of the algorithm on synthetic and real data.
\vspace{.1in}
\end{abstract}