\begin{abstract}
Most machine learning methods can be regarded as the minimization of an unavailable risk function. To optimize this function when samples arrive in a streaming fashion, we define a general stochastic Newton algorithm and its weighted average version.
In several use cases, we show that neither implementation requires inverting a Hessian estimate at each iteration; instead, the estimate of the inverse Hessian is updated directly.
This generalizes a trick introduced in \cite{BGBP2019} for the specific case of logistic regression, and results in a cost of $O(d^2)$ operations per iteration, where $d$ is the ambient dimension.
Under mild assumptions such as local strong convexity at the optimum, we establish the almost sure convergence of the algorithms and their rates of convergence, as well as central limit theorems for the resulting parameter estimates. The unified framework considered in this paper covers linear, logistic, and softmax regression, to name a few.
Numerical experiments on simulated and real data provide empirical evidence of the relevance of the proposed methods, which outperform popular competitors, particularly under poor initialization.
\end{abstract}
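
As a minimal sketch of how an inverse Hessian estimate can be updated directly at a cost of $O(d^2)$, assume (for illustration only) that the Hessian estimate receives a rank-one update $A_{n+1} = A_n + a_{n+1}\phi_{n+1}\phi_{n+1}^{\top}$, where $A_n \in \mathbb{R}^{d \times d}$ is symmetric positive definite, $a_{n+1} \geq 0$ is a scalar weight, and $\phi_{n+1} \in \mathbb{R}^d$; the symbols $A_n$, $a_{n+1}$, and $\phi_{n+1}$ are notation introduced here for this example. Under this assumption, the Sherman-Morrison formula gives
\[
A_{n+1}^{-1} \;=\; A_n^{-1} \;-\; \frac{a_{n+1}\, A_n^{-1}\phi_{n+1}\phi_{n+1}^{\top}A_n^{-1}}{1 + a_{n+1}\,\phi_{n+1}^{\top}A_n^{-1}\phi_{n+1}} .
\]
The right-hand side involves only the matrix-vector product $A_n^{-1}\phi_{n+1}$, a scalar product, and an outer product, each of which costs $O(d^2)$ operations, whereas inverting $A_{n+1}$ from scratch would typically cost $O(d^3)$.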