386cd895d50c2dbb.tex
1: \begin{abstract}
2:   % ----- importance of contextual bandit ------
3:   The contextual bandit problem is a theoretically well-founded framework with applications across many fields.
4:   % ------- limitation of previous work -----
5:   While previous studies of this problem usually require independence between the noise and the contexts,
6:   % ------- our generalization -----
7:   we consider a more general setting in which the noise acts as a latent confounder that affects both contexts and rewards.
8:   % -------- the importance of this improvement ------
9:   Such a confounded setting is more realistic and covers a broader range of applications.
10:   % --------challenges ------------
11:   However, an unaddressed confounder biases the estimate of the reward function and thus leads to large regret.
12:   % -------- link ------------
13:   To address the challenges posed by the confounder,
14:   % --------- our method ------
15:   we apply dual instrumental variable regression, which can correctly identify the true reward function.
16:   % --------- one theoretical result -------
17:   We prove that the convergence rate of this method is near-optimal in two widely used types of reproducing kernel Hilbert spaces.
18:   % --------- about bandit -------
19:   Building on these theoretical guarantees, we design computationally efficient and regret-optimal algorithms for confounded bandit problems.
20:   % --------- numerical ---------
21:   Numerical results illustrate the efficacy of the proposed algorithms in the confounded bandit setting.
22: \end{abstract}
23: