\begin{abstract}
%	Stochastic optimisation has become a key field within machine learning, not least due to the rising interest in deep learning problems.
%	Pure gradient descent methods have served as the state of the art to that end, which to a large extent can be explained by their good scaling capabilities.
	%However, there has
	In recent years there has been increased interest in stochastic adaptations of limited-memory quasi-Newton methods, which, compared to pure gradient-based routines, can improve convergence by incorporating second-order information.
	In this work we propose a direct least-squares approach that is conceptually similar to limited-memory quasi-Newton methods but computes the search direction in a slightly different way.
	This is achieved in a fast and numerically robust manner by
	maintaining a Cholesky factor of low dimension.
	The approach is combined with a stochastic backtracking line search that relies on fulfilment of the Wolfe condition, where the step length is adaptively modified according to the optimisation progress.
	We support our new algorithm with several theoretical results guaranteeing its performance.
	Its performance is demonstrated on real-world benchmark problems, where it shows improved results in comparison with established methods.
\end{abstract}