\begin{abstract}
Recent work has proposed stochastic Plackett-Luce (PL) ranking models as a robust choice for optimizing relevance and fairness metrics.
Unlike their deterministic counterparts, which require heuristic optimization algorithms, PL models are fully differentiable.
In theory, they can be used to optimize ranking metrics via stochastic gradient descent.
In practice, however, computing the exact gradient is infeasible because it requires iterating over all possible permutations of items.
Consequently, actual applications rely on approximating the gradient via sampling techniques.

In this paper, we introduce a novel algorithm, PL-Rank, that estimates the gradient of a PL ranking model w.r.t.\ both relevance and fairness metrics.
Unlike existing approaches that are based on policy gradients, PL-Rank makes use of the specific structure of PL models and ranking metrics.
Our experimental analysis shows that PL-Rank is more sample-efficient and computationally less costly than existing policy-gradient approaches, resulting in faster convergence at higher performance.
PL-Rank further enables industry to apply PL models to build more relevant and fairer real-world ranking systems.
\end{abstract}