\begin{abstract}%

  %\nbyp{Wordsmith this}
  We consider the problem of estimating multi-dimensional Gaussian mixtures (GMs) with compactly
  supported or subgaussian mixing distributions. The minimax estimation rate for this class (under
  Hellinger, TV and KL divergences) is a long-standing open question, even in one dimension. In this paper we
  characterize this rate (in every constant dimension) in terms of the metric entropy of the class. Such
  characterizations originate from the seminal works of \cite{lecam1973convergence,
  birge1983approximation,haussler1997mutual,yang1999information}.
  However, for GMs a key ingredient missing from earlier work (and widely sought after) is a comparison result showing that the KL divergence and the squared Hellinger distance are within a constant multiple of each other uniformly over the
  class. Our main technical contribution is to establish this fact, from which
  we derive an entropy characterization of the estimation rate under Hellinger and
  KL. Interestingly, the sequential (online learning) estimation rate is characterized by the
  global entropy, while the single-step (batch) rate corresponds to the local entropy, paralleling a
  similar result for the Gaussian sequence model recently discovered by~\cite{neykov2022minimax} and \cite{jaonad2023coding}. Additionally, since Hellinger is a proper metric, our comparison shows that the KL divergence between GMs satisfies the triangle inequality up to multiplicative constants, implying that the proper and improper estimation
  rates coincide.
\end{abstract}