\begin{abstract}
We present new excess risk bounds for
  general unbounded loss functions including log loss and squared
  loss, where the distribution of the losses may be heavy-tailed. The
  bounds hold for general estimators, but they are optimized when applied
  to $\eta$-generalized Bayesian, MDL, and ERM estimators. When applied
  with log loss, the bounds imply convergence rates for generalized
  Bayesian inference under misspecification in terms of a
  generalization of the Hellinger metric as long as the \emph{learning
    rate} $\eta$ is set correctly. For general loss functions, our
  bounds rely on two separate conditions: the $v$-GRIP
  \emph{(generalized reversed information projection)} conditions,
  which control
  the lower tail of the excess loss; and the newly introduced
  \emph{witness condition}, which controls the upper
  tail. The parameter $v$ in the $v$-GRIP conditions determines the
  achievable rate and is akin to the exponent in the well-known Tsybakov margin
  condition and the Bernstein condition for bounded losses, which the
  $v$-GRIP conditions generalize; favorable $v$ in combination with
  small model complexity leads to $\tilde{O}(1/n)$ rates. The
  witness condition allows us to connect the excess risk to an
  `annealed' version thereof, by which we generalize several previous
  results connecting Hellinger and R\'enyi divergence to KL
  divergence.
\end{abstract}