Performance Assessment of Reinforcement Learning Policies for Battery Lifetime Extension in Mobile Multi-RAT LPWAN Scenarios

Result type
journal article in Web of Science database
Description
Given the dynamically changing nature of the radio propagation environment, achieving the envisioned battery lifetime of the end device (ED) in massive machine-type communication (mMTC) is a critical challenge. Because the selected radio technology bounds the battery lifetime, the ability to choose among several low-power wide-area network (LPWAN) technologies integrated in a single ED may dramatically improve its lifetime. In this paper, we propose a novel approach to battery lifetime extension that utilizes reinforcement learning (RL) policies. Specifically, the system assesses the radio environment conditions and assigns appropriate rewards to minimize the overall power consumption and increase reliability. To this end, we carry out extensive propagation and power measurement campaigns at the city scale and then use the results to compose real-life use cases for static and mobile deployments. Our numerical results show that RL-based techniques allow for a noticeable increase in EDs' battery lifetime when operating in multi-radio access technology (multi-RAT) mode. Furthermore, of all the considered schemes, the weighted average policy shows the most consistent results across both deployments. Specifically, for stationary EDs, all RL policies achieve 90% of their maximum gain during the initialization phase while using fewer than 50 messages. For the mobile deployment, the improvement in battery lifetime can reach 200%.
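The reward-driven technology selection described above can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not define the weighted average policy, so the code assumes one common reading (an exponentially weighted running average of rewards with epsilon-greedy selection). The class name, the reward function, the loss penalty, the technology labels `rat_a`/`rat_b`/`rat_c`, and all energy and delivery figures are hypothetical placeholders, not measured values.

```python
import random


class WeightedAverageSelector:
    """Epsilon-greedy choice among radio technologies (hypothetical sketch).

    Each technology keeps a reward estimate that is updated as an
    exponentially weighted average, so recent messages count more.
    """

    def __init__(self, rats, alpha=0.2, epsilon=0.1, seed=0):
        self.alpha = alpha        # weight given to the newest reward
        self.epsilon = epsilon    # probability of exploring a random technology
        self.estimates = {rat: 0.0 for rat in rats}
        self.rng = random.Random(seed)

    def select(self):
        # Explore with probability epsilon, otherwise pick the best estimate.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(list(self.estimates))
        return max(self.estimates, key=self.estimates.get)

    def update(self, rat, reward_value):
        # Weighted average update: Q <- Q + alpha * (r - Q)
        q = self.estimates[rat]
        self.estimates[rat] = q + self.alpha * (reward_value - q)


def reward(energy_mj, delivered, loss_penalty=100.0):
    """Assumed reward: negative energy cost, with an extra penalty on loss."""
    return -energy_mj if delivered else -(energy_mj + loss_penalty)


def simulate(selector, energy_mj, delivery_prob, n_messages, seed=1):
    """Send n_messages, letting the selector choose a technology each time."""
    channel = random.Random(seed)
    total_energy = 0.0
    for _ in range(n_messages):
        rat = selector.select()
        delivered = channel.random() < delivery_prob[rat]
        total_energy += energy_mj[rat]
        selector.update(rat, reward(energy_mj[rat], delivered))
    return total_energy


if __name__ == "__main__":
    # Placeholder per-message energy (mJ) and delivery probabilities.
    energy = {"rat_a": 50.0, "rat_b": 120.0, "rat_c": 300.0}
    delivery = {"rat_a": 0.95, "rat_b": 0.98, "rat_c": 0.99}
    sel = WeightedAverageSelector(list(energy))
    spent = simulate(sel, energy, delivery, n_messages=200)
    print(f"energy spent: {spent:.0f} mJ, estimates: {sel.estimates}")
```

Because all rewards are negative and the estimates start at zero, every technology is tried early on, which loosely mirrors an initialization phase. A mobile scenario could be approximated by changing the delivery probabilities over time, in which case the recency weighting lets the estimates track the new conditions.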
Keywords
LPWAN
Multi-RAT
End-device lifetime
Energy consumption optimization
Reinforcement learning