1: \begin{abstract}
2: A new variant of Newton's method for empirical risk minimization is studied, in which, at each iteration, the gradient and Hessian of the objective function are replaced by robust estimators drawn from the existing literature on robust mean estimation for multivariate data. After a general theorem is proved on the convergence of successive iterates to a small ball around the population-level minimizer, consequences of the theory for generalized linear models are studied when data are generated from Huber's $\epsilon$-contamination model and/or heavy-tailed distributions. An algorithm for obtaining robust Newton directions based on the conjugate gradient method is also proposed, which may be more appropriate in high-dimensional settings, and conjectures about the convergence of the resulting algorithm are offered. Compared to robust gradient descent, the proposed algorithm enjoys the faster convergence of successive iterates often achieved by second-order algorithms for convex problems, namely quadratic convergence in a neighborhood of the optimum, with a step size that may be chosen adaptively via backtracking line search.
3: %We provide a new computationally efficient algorithm that finds estimators for the risk minimization problem. We show that these estimators are robust for general statistical models, under the robustness setting of the classical Huber $\epsilon$-contamination model. Our workhorse is a novel robust variant of Newton's method, and we provide conditions under which this variant yields accurate estimators in a general convex risk minimization problem. We provide specific consequences of our theory for linear regression. Finally, we study the empirical performance of our proposed methods on synthetic datasets, and find that our methods convincingly outperform a variety of baselines.
4: \end{abstract}
5: