
\begin{abstract}

Counterfactual evaluation can estimate \ac{CTR} differences between ranking systems from historical interaction data, while mitigating the effects of position bias and item-selection bias.

We introduce the novel \acf{LogOpt}, which optimizes the policy for logging data so that the counterfactual estimate has minimal variance.

As minimizing variance leads to faster convergence, \ac{LogOpt} increases the data-efficiency of counterfactual estimation.

\ac{LogOpt} turns the counterfactual approach -- which is indifferent to the logging policy -- into an online approach, where the algorithm decides which rankings to display.

We prove that, as an online evaluation method, \ac{LogOpt} is unbiased w.r.t.\ position and item-selection bias, unlike existing interleaving methods.

Furthermore, we perform large-scale experiments by simulating comparisons between \emph{thousands} of rankers.

Our results show that, while interleaving methods make systematic errors, \ac{LogOpt} is as efficient as interleaving without being biased.

\end{abstract}
