520decb95bbd5a15.tex
1: \begin{abstract}
2: Semantic communication (SemCom) has been deemed as a promising communication paradigm to break through the bottleneck of traditional communications. Nonetheless, most of the existing works focus more on point-to-point communication scenarios and its extension to multi-user scenarios is not that straightforward due to its cost-inefficiencies to directly scale the JSCC framework to the multi-user communication system. 
3: Meanwhile, previous methods optimize the system by differentiable bit-level supervision, easily leading to a ``semantic gap''.   
4: Therefore, we delve into multi-user broadcast communication (BC) based on the universal transformer (UT) and propose a reinforcement learning (RL) based self-critical alternate learning (SCAL) algorithm, named SemanticBC-SCAL, to capably adapt to the different BC channels from one transmitter (\tx) to multiple receivers (\rxs) for sentence generation task.  
5: In particular, to enable stable optimization via a non-differentiable semantic metric, we regard sentence similarity as a reward and formulate this learning process as an RL problem. Considering the huge decision space, we adopt a lightweight but efficient self-critical supervision to guide the learning process. Meanwhile, an alternate learning mechanism is developed to provide  
6: cost-effective learning, in which the encoder and decoders are updated asynchronously with different iterations. Notably, the incorporation of RL makes SemanticBC-SCAL compliant with any user-defined semantic similarity metric and simultaneously addresses the channel non-differentiability issue by alternate learning. Besides, the convergence of SemanticBC-SCAL is also theoretically established. Extensive simulation results have been conducted to verify the effectiveness and superiorness of our approach, especially in low SNRs.     
7:     
8: \end{abstract}
9: