\begin{abstract}
We study the convergence of the predictive surface of regression trees and forests.
To support our analysis, we introduce a notion of adaptive concentration for regression trees.
This approach breaks tree training into a model selection phase, in which we pick the tree splits, followed by a model fitting phase, in which we find the best regression model consistent with these splits.
We then show that the fitted regression tree concentrates around the optimal predictor with the same splits:
as $d$ and $n$ get large, the discrepancy is, with high probability, bounded on the order of $\sqrt{\log(d)\log(n)/k}$ uniformly over the whole regression surface, where $d$ is the dimension of the feature space, $n$ is the number of training examples, and $k$ is the minimum leaf size for each tree.
We also provide rate-matching lower bounds for this adaptive concentration statement.
From a practical perspective, our result enables us to prove consistency results for adaptively grown forests in high dimensions,
and to carry out valid post-selection inference in the sense of Berk et al.\ [2013] for subgroups defined by tree leaves.
\end{abstract}
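
In symbols, the concentration statement can be read roughly as follows. This is only a sketch: the notation $\Lambda$, $T_\Lambda$, $T^*_\Lambda$ and the unspecified constant $C$ are assumptions introduced here for illustration and are not fixed by the abstract. Writing $\Lambda$ for the splits chosen in the model selection phase, $T_\Lambda$ for the tree fitted on the training sample with those splits, and $T^*_\Lambda$ for the optimal predictor with the same splits, the claim is that, with high probability as $d$ and $n$ get large,
\[
  \sup_{x} \left| T_\Lambda(x) - T^*_\Lambda(x) \right|
  \;\le\; C \sqrt{\frac{\log(d)\,\log(n)}{k}} ,
\]
where the supremum over $x$ expresses uniformity over the whole regression surface and $k$ is the minimum leaf size. Under this reading, the bound shrinks as the minimum leaf size $k$ grows and degrades only logarithmically in $d$ and $n$.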
