
\begin{abstract}


Gradient boosting performs exceptionally well in most prediction problems and scales well to large datasets.

In this paper we prove that a ``lassoed'' gradient boosted tree algorithm with early stopping achieves faster than $n^{-1/4}$ $L_2$ convergence in the large nonparametric space of cadlag functions of bounded sectional variation.

This rate is remarkable because it does not depend on the dimension, sparsity, or smoothness.

We use simulation and real data to confirm our theory and demonstrate empirical performance and scalability on par with standard boosting.

Our convergence proofs are based on a novel, general theorem on early stopping with empirical loss minimizers of nested Donsker classes.

\end{abstract}
