\begin{abstract}
  A fundamental challenge in machine learning is the choice of a loss, since the loss characterizes the learning task, is minimized in the training phase, and serves as an evaluation criterion for estimators.
  Proper losses are a common choice because they ensure that minimizers of the full risk match the true probability vector.
  Estimators induced from a proper loss are widely used to construct forecasters for downstream tasks such as classification and ranking.
  In this procedure, how well does the forecaster based on the obtained estimator perform on a given downstream task?
  This question is closely tied to how the $p$-norm distance between the estimated and true probability vectors behaves as the estimator is updated.
  In the proper loss framework, the suboptimality of the estimated probability vector relative to the true probability vector is measured by a surrogate regret.
  First, we analyze the surrogate regret and show that the \emph{strict} properness of a loss is necessary and sufficient to establish a non-vacuous surrogate regret bound.
  Second, we resolve an important open question by showing that the order of convergence in $p$-norm cannot be faster than the $1/2$-order of surrogate regrets for a broad class of strictly proper losses.
  This implies that strongly proper losses entail the optimal convergence rate.
\end{abstract}