\begin{abstract}
In second-order optimization, a potential bottleneck can be computing the
Hessian matrix of the optimized function at every iteration. Randomized
sketching has emerged as a powerful technique for constructing
estimates of the Hessian which can be used to perform approximate
Newton steps. This involves multiplication by a
random sketching matrix, which introduces a trade-off between the
computational cost of sketching and the convergence rate of the
optimization algorithm. A theoretically desirable but practically much too
expensive choice is to use a dense Gaussian sketching matrix, which
produces unbiased estimates of the exact Newton step and which offers strong
problem-independent convergence guarantees. We show that the Gaussian
sketching matrix can be drastically sparsified, significantly reducing the
computational cost of sketching, without substantially affecting its convergence
properties. This approach, called Newton-LESS, is based on a recently introduced
sketching technique: LEverage Score Sparsified (LESS)
embeddings. We prove that Newton-LESS enjoys nearly the same
problem-independent local convergence rate as Gaussian embeddings, not
just up to constant factors but even down to lower order terms, for
a large class of optimization tasks. In
particular, this leads to a new state-of-the-art convergence result for
an iterative least squares solver. Finally,
we extend LESS embeddings to include uniformly sparsified random sign matrices
which can be implemented efficiently and which perform well in numerical experiments.
\end{abstract}
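
As a rough illustration of the sketched Newton step described in the abstract, the following is a generic sketch under assumed notation, not necessarily the exact formulation used in the paper. Suppose the Hessian of the objective $f$ at the iterate $x_t$ factors as $\nabla^2 f(x_t) = A_t^\top A_t$ for an $n \times d$ matrix $A_t$ with $n \gg d$, and let $S_t$ be an $m \times n$ random sketching matrix with $d \le m \ll n$, scaled so that $\mathrm{E}[S_t^\top S_t] = I_n$. The symbols $A_t$, $S_t$, $m$, $s$, and the step size $\mu_t$ are introduced here only for illustration. An approximate Newton step then takes the form
\[
  \widehat{H}_t = A_t^\top S_t^\top S_t A_t,
  \qquad
  x_{t+1} = x_t - \mu_t \, \widehat{H}_t^{-1} \nabla f(x_t),
\]
where $\mathrm{E}[\widehat{H}_t] = \nabla^2 f(x_t)$ by the scaling assumption on $S_t$. The trade-off mentioned above shows up in the cost of forming $S_t A_t$: a dense $S_t$ requires $O(mnd)$ arithmetic operations, whereas a sketching matrix with at most $s$ nonzero entries per row requires only $O(msd)$.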