\begin{abstract}
This paper considers a resource allocation problem in which several Internet-of-Things (IoT) devices send data to a base station (BS) in a cellular network, with or without the assistance of a reconfigurable intelligent surface (RIS).
The objective is to maximize the sum rate of all IoT devices by finding the optimal RIS and spreading factor (SF) for each device.
Because these IoT devices have no prior information about the RISs or the channel state information (CSI), achieving this goal requires a distributed, low-complexity resource allocation framework with learning capability.
We therefore model the problem as a two-stage multi-player multi-armed bandit (MPMAB) problem in which the optimal RIS and SF are learned sequentially.
We then propose an exploration and exploitation boosting (E2Boost) algorithm that solves this two-stage MPMAB problem by combining the $\epsilon$-greedy algorithm, the Thompson sampling (TS) algorithm, and a non-cooperative game method.
We derive an upper bound on the regret of the proposed algorithm, $\mathcal{O}(\log^{1+\delta}_2 T)$, which grows logarithmically with the time horizon $T$.
Numerical results show that the E2Boost algorithm outperforms existing methods and converges quickly.
More importantly, thanks to its two-stage allocation mechanism, the proposed algorithm is insensitive to the number of RIS and SF combinations, which can benefit high-density networks.
\end{abstract}