1: \begin{abstract}
2: 
3: 
4: 
5: 
6:  
7: % \ac{UAV} are increasingly utilized across various applications such as surveillance, agriculture, disaster management, and communication infrastructure maintenance, necessitating efficient trajectory planning and optimization. A significant obstacle to \ac{UAV} navigation lies in ensuring reliable communication with terrestrial infrastructure which makes the trajectory planning even a harder optimization problems to solve. 
8: \ac{DRL} is well suited to \ac{UAV} trajectory planning: it handles high-dimensional state spaces, adapts to dynamic environments, and makes sequential decisions from real-time feedback. However, a \ac{DRL} policy typically requires substantial retraining whenever the \ac{UAV} is deployed in a new environment, which wastes time and resources. Techniques that reduce this retraining overhead are therefore essential if \ac{DRL} models are to adapt to constantly changing environments. This paper presents a novel method that reduces the need for extensive retraining by using a pre-trained \ac{DDQN} model as a base and adapting it to different urban environments through \ac{CTL}. The method transfers the learned model weights and adjusts the training hyperparameters, including the learning rate and exploration rate, to suit the characteristics of each new environment. We validate the approach in three scenarios that differ in how similar the new environment is to the original one. \ac{CTL} significantly improves learning speed and success rate compared with \ac{DDQN} models trained from scratch. In similar environments, \ac{TL} improved stability and accelerated convergence by 65\%; in dissimilar environments, it enabled 35\% faster adaptation.
9: 
10: 
11: 
12: 
13: \end{abstract}
14: