\begin{abstract}
% We propose a new ensemble learning method that rigorously handles a combination of infinitely many classifiers.
% Our method constructs the {\it infinite ensemble} by optimizing a probability measure on the set of classifiers.
% Through a theoretical analysis, we show that our method generalizes better than the finite ensembles constructed by existing methods.
% Because our method operates in a space of probability measures, it needs no penalization; we therefore avoid both handling a penalization term and tuning the early-stopping time, unlike existing ensemble methods.
% This optimization is realized by a general-purpose stochastic optimization method that we propose for learning probability measures, parameterized by transport maps on base classifiers.
% We provide several theoretical tools that allow us to construct convergence analyses by analogy with the finite-dimensional case.
% Using these tools, we obtain a convergence rate as fast as that of a stochastic optimization method for finite-dimensional nonconvex problems.
% Moreover, we show an interior optimality property of a local optimality condition used in our analysis.
% \end{abstract}
