8cf3dda0a5df92b4.tex
1: \begin{abstract}
2:  
3: Gradient boosting performs exceptionally well in most prediction problems and scales well to large datasets.
4: In this paper we prove that a ``lassoed'' gradient boosted tree algorithm with early stopping achieves an $L_2$ convergence rate faster than $n^{-1/4}$ in the large nonparametric space of c\`adl\`ag functions of bounded sectional variation.
5: This rate is remarkable because it does not depend on the dimension, sparsity, or smoothness.
6: We use simulations and real data to confirm our theory and to demonstrate empirical performance and scalability on par with standard boosting.
7: Our convergence proofs are based on a novel, general theorem on early stopping with empirical loss minimizers of nested Donsker classes. 
8: \end{abstract}
9: