
\begin{abstract}
A line of recent work has analyzed the behavior of the
Expectation-Maximization (EM) algorithm in the well-specified setting,
in which the population likelihood is locally strongly concave around
its maximizing argument.  Examples include suitably separated Gaussian
mixture models and mixtures of linear regressions.  We consider
over-specified settings in which the number of fitted components is
larger than the number of components in the true distribution. Such
mis-specified settings can lead to singularity in the Fisher
information matrix, and moreover, the maximum likelihood estimator
based on $n$ i.i.d.\ samples in $d$ dimensions can have a non-standard
$\mathcal{O}((d/n)^{\frac{1}{4}})$ rate of convergence.  Focusing on
the simple setting of two-component mixtures fit to a $d$-dimensional
Gaussian distribution, we study the behavior of the EM algorithm both
when the mixture weights are different (unbalanced case) and when they
are equal (balanced case). Our analysis reveals a sharp distinction
between these two cases: in the former, the EM algorithm converges
geometrically to a point at Euclidean distance of
$\mathcal{O}((d/n)^{\frac{1}{2}})$ from the true parameter, whereas in
the latter case, the convergence rate is exponentially slower, and the
fixed point has a much lower $\mathcal{O}((d/n)^{\frac{1}{4}})$
accuracy.  Analysis of this singular case requires the introduction of
some novel techniques: in particular, we make use of a careful form of
localization in the associated empirical process, and develop a
recursive argument to progressively sharpen the statistical rate.
\end{abstract}
