\begin{abstract} In many recommendation applications, such as news
recommendation, the items that can be recommended come and go at a
very fast pace.
This setting is challenging for recommender systems (RS).
% In this context, it is very hard to build recommender
%systems (RS) based on the classical methods {\color{red}XXX what
%  is this? XXX} and 
Online learning algorithms seem to be the most straightforward
solution. The contextual bandit framework was introduced for
that very purpose. In general, the evaluation of an RS is a critical
issue. Live evaluation is often avoided due to the potential loss of
revenue, hence the need for offline evaluation methods. Two options
are available. Model-based methods are biased by nature and are thus
difficult to trust when used alone. Data-driven methods are therefore
what we consider here. Evaluating online learning algorithms with past
data is not simple, but some methods exist in the
literature. Nonetheless, their accuracy is not satisfactory, mainly
because their data rejection mechanism only allows the exploitation
of a small fraction of the data. We address precisely this issue in
this paper. After highlighting the limitations of previous
methods, we present a new method based on bootstrapping
techniques. This new method comes with two important improvements: it
is much more accurate, and it provides a measure of the quality of its
estimation. The latter is highly desirable for
minimizing the risks entailed by putting an RS online for the first
time. We provide both theoretical and experimental evidence of its
superiority over state-of-the-art methods, as well as an
analysis of the convergence of the measure of quality.
\end{abstract}