a7c3f568e26b4392.tex
1: \begin{abstract}
2: We consider a finite mixture of regressions (FMR) model for high-dimensional
3: inhomogeneous data where the number of covariates may be much
4: larger than sample size. We propose an $\ell_1$-penalized maximum likelihood
5: estimator in an appropriate parameterization. This kind of estimation
6: belongs to a class of problems where optimization and theory for
7: non-convex functions are needed. This sets it clearly apart from
8: high-dimensional estimation with convex loss or objective functions, as
9: for example with the Lasso in linear or generalized linear models. Mixture
10: models are a prime example where non-convexity arises.
11: 
12: For FMR models, we develop an efficient
13: EM algorithm for numerical optimization with provable convergence
14: properties. Our penalized estimator is
15: numerically better posed (e.g., the criterion function is bounded)
16: than unpenalized maximum likelihood estimation, and it
17: allows for effective statistical regularization including variable
18: selection. We also present some asymptotic theory and oracle inequalities.
19: Because the negative log-likelihood function is non-convex, these results
20: require different mathematical arguments than problems with convex
21: losses. Finally, we apply
22: the new method to both simulated and real data. \vspace{0.5cm}\\
23: {\bf Keywords} {Adaptive Lasso, Finite mixture models, Generalized EM algorithm, High-dimensional estimation, Lasso, Oracle inequality}
24: \vspace{0.5cm}\\
25: {\bf This is the author's version of the work (published as a discussion paper in TEST, 2010, Volume 19, 209--285). The final publication is available at www.springerlink.com.}
26: \end{abstract}
27: