\begin{abstract}
We present new excess risk bounds for
general unbounded loss functions including log loss and squared
loss, where the distribution of the losses may be heavy-tailed. The
bounds hold for general estimators, but they are optimized when applied
to $\eta$-generalized Bayesian, MDL, and ERM estimators. When applied
with log loss, the bounds imply convergence rates for generalized
Bayesian inference under misspecification in terms of a
generalization of the Hellinger metric as long as the \emph{learning
rate} $\eta$ is set correctly. For general loss functions, our
bounds rely on two separate conditions: the $v$-GRIP
(\emph{generalized reversed information projection}) conditions, which control
the lower tail of the excess loss; and the newly introduced
\emph{witness condition}, which controls the upper
tail. The parameter $v$ in the $v$-GRIP conditions determines the
achievable rate and is akin to the exponent in the well-known Tsybakov margin
condition and the Bernstein condition for bounded losses, which the
$v$-GRIP conditions generalize; favorable $v$ in combination with
small model complexity leads to $\tilde{O}(1/n)$ rates. The
witness condition allows us to connect the excess risk to an
`annealed' version thereof, by which we generalize several previous
results connecting Hellinger and R\'enyi divergence to KL
divergence.
\end{abstract}
