\begin{abstract}%

%\nbyp{Wordsmith this}
We consider the problem of estimating multi-dimensional Gaussian mixtures (GMs) with compactly
supported or subgaussian mixing distributions. The minimax estimation rate for this class (under
Hellinger, TV and KL divergences) has been a long-standing open question, even in one dimension.
In this paper we characterize this rate (for all constant dimensions) in terms of the metric
entropy of the class. Such characterizations originate from the seminal works of
\cite{lecam1973convergence,birge1983approximation,haussler1997mutual,yang1999information}.
However, for GMs a key ingredient missing from earlier work (and widely sought after) is a
comparison result showing that the KL divergence and the squared Hellinger distance are within
a constant multiple of each other uniformly over the class. Our main technical contribution is
to establish this fact, from which we derive an entropy characterization of the estimation rate
under Hellinger and KL. Interestingly, the sequential (online learning) estimation rate is
characterized by the global entropy, while the single-step (batch) rate corresponds to the local
entropy, paralleling a similar result for the Gaussian sequence model recently discovered
by~\cite{neykov2022minimax} and~\cite{jaonad2023coding}. Additionally, since Hellinger is a
proper metric, our comparison shows that the KL divergence over GMs satisfies the triangle
inequality up to multiplicative constants, implying that proper and improper estimation
rates coincide.
\end{abstract}
