1: \begin{abstract}
2: We consider the top-$k$ arm identification problem for multi-armed bandits with rewards belonging to a one-parameter canonical exponential family.
3: The objective is to select the set of $k$ arms with the highest mean rewards by sequential allocation of sampling efforts.
4: We propose a unified optimal allocation problem that identifies the complexity measures of this problem in the fixed-confidence and fixed-budget settings, as well as the posterior convergence rate from the Bayesian perspective.
5: We provide the first characterization of optimality for this allocation problem.
6: We provide the first provably optimal algorithm in the fixed-confidence setting for $k>1$.
7: We also propose an efficient heuristic algorithm for the top-$k$ identification problem.
8: Extensive numerical experiments demonstrate superior performance compared to existing methods in all three settings.
9: %We propose asymptotically optimal anytime and parameter-free algorithms based on the analysis of the optimal allocation problem.
10: %Numerical experiments demonstrate superior performance over existing algorithms.
11: \end{abstract}
12: