\begin{abstract}
In this paper, we consider the problem of stochastic optimization under a bandit feedback model.
We generalize the \textsc{GP-UCB} algorithm [Srinivas et al., 2012] to arbitrary kernels and search spaces.
To do so, we use a notion of localized chaining to control the supremum of a Gaussian process,
and provide a novel optimization scheme based on the computation of covering numbers.
The theoretical bounds we obtain on the cumulative regret are more general
and match the convergence rates of the \textsc{GP-UCB} algorithm.
Finally, the algorithm is shown to be empirically more efficient than its natural competitors
on both simple and complex input spaces.
\end{abstract}