1: \begin{abstract}
2: We address the problem of \emph{online convex optimization (OCO) with dueling feedback}, where the learner is restricted to a weaker form of relative feedback that reveals only a single noisy binary ($0/1$) comparison of two queried points.
3: % 
4: This is unlike the traditional optimization setting, which assumes online oracle access to gradient or at least zeroth-order function information; despite this restriction, our goal is still to find the optimal point with the least possible query complexity.
5: %
6: The problem has been addressed previously for some restricted classes of preference feedback \cite{SKM21,Jamieson12}; we instead consider a general preference class based on degree-$p$ polynomial link functions as the underlying dueling-feedback model. %
7: %
8: The main contribution of this work is a runtime-efficient algorithm with provably optimal convergence rates: in particular, for $\beta$-smooth convex functions, we show a convergence rate of $\tilde O(\epsilon^{-4p})$.
9: %
10: Further, if the function is also $\alpha$-strongly convex, we show a convergence rate of $\tilde O(\epsilon^{-2p})$, which is provably optimal.
11: Our work closes the open problem of function optimization with general polynomial link functions, going beyond the sign-feedback and strong-convexity assumptions.
12: %
13: %The efficacy of our proposed algorithms are justified through empirical comparisons with an existing baseline. % which shows them to outperform the state of the art.
14: \end{abstract}
15: