abstract:0557706bbc12b110.tex

1: \begin{abstract}

2: %Methods in \ac{OLTR} have so far only allowed for the optimization of linear models. We introduce a novel method with a differentiable loss, that allows for any differentiable model to be optimized.

3:

4: \acf{OLTR} methods optimize rankers based on user interactions.

5: % efficiently and reliably.

6: State-of-the-art \acs{OLTR} methods are built specifically for linear models.

7: Their approaches do not extend well to non-linear models such as neural networks.

8: We introduce an entirely novel approach to \acs{OLTR} that constructs a weighted differentiable pairwise loss after each interaction: \acf{\OurMethod}.

9: \ac{\OurMethod} breaks away from the traditional approach that relies on interleaving or multileaving and extensive sampling of models to estimate gradients.

10: Instead, its gradient is based on preferences between document pairs inferred from user clicks, and it can be used to optimize any differentiable model.

11: We prove that the gradient of \acs{\OurMethod} is unbiased w.r.t.\ user preferences between document pairs.

12: Our experiments on the largest publicly available \ac{LTR} datasets show considerable and significant improvements under all levels of interaction noise.

13: \acs{\OurMethod} outperforms existing \ac{OLTR} methods in terms of both learning speed and final convergence.

14: Furthermore, unlike previous \ac{OLTR} methods, \ac{\OurMethod} also allows for non-linear models to be optimized effectively.

15: Our results show that using a neural network leads to even better performance at convergence than a linear model.

16: In summary, \acs{\OurMethod} is an efficient and unbiased \ac{OLTR} approach that provides a better user experience than previously possible.

17: \end{abstract}

18: