abstract:7c6bba60f301f50b.tex

1: \begin{abstract}

2: Ensemble learning is a statistical paradigm

3: %in which multiple learners are trained to solve the same problem.

4: %Its success is

5: built on the premise that  many weak learners can perform exceptionally well when deployed collectively.

6: %These methods have had a decided impact on

7: %The success of ensemble learning can be attributed  to the fact that  many weak learners can perform exceptionally well when deployed collectively.

8: The BART method of \cite{chipman2010bart} is a prominent example of {\sl Bayesian} ensemble learning, where each learner is a tree.

9: Owing to its impressive performance, BART has attracted considerable attention from practitioners.

10: Despite this popularity, however, theoretical studies of BART have begun to emerge only very recently.

11: Laying the foundations for the theoretical analysis of Bayesian forests, \cite{rockova2017posterior} showed optimal posterior concentration under {\sl conditionally uniform tree priors}.

12: These priors  deviate from the actual priors implemented in BART. Here, we study the exact BART prior and propose a simple modification so that  it {\sl also} enjoys optimality properties.

13: To this end, we draw on branching process theory. We obtain tail bounds for the distribution of total progeny under heterogeneous Galton--Watson (GW) processes by exploiting their connection to random walks.

14: We conclude with a result stating the optimal rate of posterior convergence for BART.

15: \end{abstract}

16: