abstract:157966efb434e02c.tex

\begin{abstract}
In this paper, a lifelong learning problem is studied for an Internet of Things (IoT) system.
In the considered model, each IoT device aims to balance the tradeoff between its information freshness and its energy consumption by controlling its computational resource allocation at each time slot under dynamic environments.
An unmanned aerial vehicle (UAV) is deployed as a flying base station to enable the IoT devices to adapt to novel environments.
To this end, a new lifelong reinforcement learning algorithm is proposed, which the UAV uses to adapt the operation of the devices at each visit.
By using the experience from previously visited devices and environments, the UAV can help devices adapt faster to future states of their environment.
To do so, a knowledge base shared by all devices is maintained at the UAV.
Simulation results show that the proposed algorithm can converge $25\%$ to $50\%$ faster than a policy gradient baseline algorithm that optimizes each device's decision-making problem in isolation.
%AoI and energy in isolation.
%yield a $25\%\sim50\%$ improvement in the convergence speed, compared to policy gradient baseline algorithm that optimizes each device's AoI and energy in isolation.
%improve the convergence speed by 25%-50%,
\end{abstract}