\begin{abstract}
Recent work across many machine learning disciplines has
highlighted that standard descent methods, even without explicit
regularization, do not merely minimize the training error, but also
exhibit an \emph{implicit bias}.
This bias is typically towards a certain regularized solution, and
relies upon the details of the learning process,
for instance the use of the cross-entropy loss.

In this work, we show that for empirical risk minimization over linear
predictors with \emph{arbitrary} convex, strictly decreasing losses, if the
risk does not attain its infimum, then the gradient-descent path and the
\emph{algorithm-independent} regularization path converge to the same
direction (whenever either converges to a direction).
Using this result, we provide a justification for the widely-used
exponentially-tailed losses (such as the exponential loss or the logistic
loss):
while this convergence to a direction for exponentially-tailed losses is
necessarily to the maximum-margin direction, other losses such as
polynomially-tailed losses may induce convergence to a direction
with a poor margin.
\end{abstract}
