abstract:ed2b8408911d11b1.tex

1: \begin{abstract}

2: Deep reinforcement learning (DRL) has been shown to be successful in many application domains.

3: %

4: Combining recurrent neural networks (RNNs) with DRL makes DRL applicable to non-Markovian environments by capturing temporal information.

5: %

6: However, training both DRL and RNNs is known to be challenging, requiring a large amount of training data to achieve convergence.

7: %

8: In many targeted applications, such as those in fifth-generation (5G) cellular communication, the environment is highly dynamic while the available training data is very limited.

9: %

10: Therefore, it is extremely important to develop DRL strategies that can capture the temporal correlation of the dynamic environment while requiring limited training overhead.

11: %

12: In this paper, we introduce the deep echo state Q-network (DEQN) that can adapt to the highly dynamic environment in a short period of time with limited training data.

13: %

14: We evaluate the performance of the introduced DEQN method in a dynamic spectrum sharing (DSS) scenario; DSS is a promising technology for increasing spectrum utilization in 5G and future 6G networks.

15: %

16: In contrast to the conventional spectrum management policy, which grants a fixed spectrum band to a single system for exclusive access, DSS allows a secondary system to share the spectrum with the primary system.

17: %

18: Our work sheds light on the application of an efficient DRL framework in highly dynamic environments with limited available training data.

19: \end{abstract}

20: