
\begin{abstract}

Adaptive methods are extremely popular in machine learning because they make learning rate tuning less expensive. This paper introduces a novel optimization algorithm named \algname{KATE}, a scale-invariant adaptation of the well-known \algname{AdaGrad} algorithm. We prove the scale-invariance of \algname{KATE} for the case of Generalized Linear Models. Moreover, for general smooth non-convex problems, we establish a convergence rate of $\cO(\nicefrac{\log T}{\sqrt{T}})$ for \algname{KATE}, matching the best-known rates for \algname{AdaGrad} and \algname{Adam}.

We also compare \algname{KATE} to the state-of-the-art adaptive algorithms \algname{Adam} and \algname{AdaGrad} in numerical experiments on a range of problems, including complex machine learning tasks such as image classification and text classification on real data. The results indicate that \algname{KATE} consistently outperforms \algname{AdaGrad} and matches or surpasses the performance of \algname{Adam} in all considered scenarios.

\end{abstract}
