1: \begin{abstract}
2: % Stochastic optimisation has become a key field within machine learning, not at least due to the rising interest in deep learning problems.
3: % Pure gradient descent methods have served as a state-of-the-art to that end, which to a large extent can be explained by their good scaling capabilities.
4: %However, there has
5: In recent years there has been increased interest in stochastic adaptations of limited memory quasi-Newton methods, which, compared to pure gradient-based routines, can improve convergence by incorporating second-order information.
6: In this work we propose a direct least-squares approach that is conceptually similar to the limited memory quasi-Newton methods but computes the search direction in a slightly different way.
7: This is achieved in a fast and numerically robust manner by
8: maintaining a Cholesky factor of low dimension.
9: The resulting search direction is combined with a stochastic backtracking line search based on the Wolfe condition, in which the step length is adapted to the optimisation progress.
10: We support our new algorithm by providing several theoretical results guaranteeing its performance.
11: Experiments on real-world benchmark problems show improved results in comparison with established methods.
12: \end{abstract}