\begin{abstract}
We study a fundamental class of regression models called the second
order linear model (SLM). The SLM extends the linear model to a high
order functional space and has attracted considerable research interest
recently. Yet how to efficiently learn the SLM under full generality
using a nonconvex solver remains an open question due to several
fundamental limitations of the conventional gradient descent learning
framework. In this study, we attack this problem with a gradient-free
approach which we call the moment-estimation-sequence (MES) method.
We show that the conventional gradient descent heuristic is biased
by the skewness of the distribution and is therefore no longer the best
practice for learning the SLM. Based on the MES framework, we design
a nonconvex alternating iteration process to train a $d$-dimensional
rank-$k$ SLM using $O(kd)$ memory and a single pass over the dataset.
The proposed method converges globally and linearly, achieving $\epsilon$
recovery error after retrieving $O[k^{2}d\cdot\mathrm{polylog}(kd/\epsilon)]$
samples. Furthermore, our theoretical analysis reveals that not all
SLMs can be learned on every sub-gaussian distribution. When the instances
are sampled from a so-called $\tau$-MIP distribution, the SLM can
be learned from $O(p/\tau^{2})$ samples, where $p$ and $\tau$ are
positive constants depending on the skewness and kurtosis of the distribution.
For non-MIP distributions, an additional diagonal-free oracle is necessary
and sufficient to guarantee the learnability of the SLM. Numerical
simulations verify the sharpness of our bounds on the sampling complexity
and the linear convergence rate of our algorithm.
\end{abstract}