\begin{abstract}
We consider a finite mixture of regressions (FMR) model for high-dimensional
inhomogeneous data where the number of covariates may be much
larger than the sample size. We propose an $\ell_1$-penalized maximum likelihood
estimator in an appropriate parameterization. This kind of estimation
belongs to a class of problems where optimization and theory for
non-convex functions are needed. This setting differs clearly from
high-dimensional estimation with convex loss or objective functions, as
for example with the Lasso in linear or generalized linear models. Mixture
models are a prime example where non-convexity arises.

For FMR models, we develop an efficient
EM algorithm for numerical optimization with provable convergence
properties. Our penalized estimator is
numerically better posed (e.g., the criterion function is bounded)
than unpenalized maximum likelihood estimation, and it
allows for effective statistical regularization including variable
selection. We also present some asymptotic theory and oracle inequalities:
due to non-convexity of the negative log-likelihood function, these require
different mathematical arguments than problems with convex
losses. Finally, we apply
the new method to both simulated and real data. \vspace{0.5cm}\\
{\bf Keywords} {Adaptive Lasso, Finite mixture models, Generalized EM algorithm, High-dimensional estimation, Lasso, Oracle inequality}
\vspace{0.5cm}\\
{\bf This is the author’s version of the work (published as a discussion paper in TEST, 2010, Volume 19, 209--285). The final publication is available at www.springerlink.com.}
\end{abstract}