\begin{abstract}
Given a finite family of functions, the goal of model selection
aggregation is to construct a procedure that mimics the function from
this family that is the closest to an unknown regression function.
More precisely, we consider a general regression model with fixed
design and measure the distance between functions by the mean squared
error at the design points.
While procedures based on exponential weights are known to solve the problem
of model selection aggregation in expectation, they are, surprisingly,
sub-optimal in deviation.
We propose a new formulation called $Q$-aggregation that addresses this
limitation; namely, its solution leads to sharp oracle inequalities that are
optimal in a minimax sense.
Moreover, based on the new formulation, we design greedy
$Q$-aggregation procedures that produce sparse aggregation models
achieving the optimal rate.
The convergence and performance of these greedy procedures
are illustrated and compared with other standard methods on simulated examples.
\end{abstract}