\begin{abstract}
    In min-min optimization or max-min optimization, one has to compute the gradient of a function defined as a minimum.
    %
    In most cases, the minimum has no closed form, and an approximation is obtained via an iterative algorithm.
    %
    There are two common ways of estimating the gradient of the function: using either an \emph{analytic} formula obtained by assuming exactness of the approximation, or \emph{automatic} differentiation through the algorithm.
    %
    In this paper, we study the asymptotic error made by these estimators as a function of the optimization error.
    %
    We find that the error of the automatic estimator is close to the square of the error of the analytic estimator, reflecting a \emph{super-efficiency} phenomenon.
    %
    The convergence of the automatic estimator depends strongly on the convergence of the Jacobian of the algorithm.
    %
    We analyze this Jacobian for gradient descent and stochastic gradient descent and derive convergence rates for the estimators in these cases.
    %
    Our analysis is backed by numerical experiments on toy problems and on Wasserstein barycenter computation.
    %
    Finally, we discuss the computational complexity of these estimators and give practical guidelines to choose between them.
\end{abstract}