Deep Reinforcement Learning with Spatio-temporal Traffic Forecasting for Data-Driven Base Station Sleep Control


Published by Xu Chen

Publication date: 2021

Research field: Informatics Engineering

Paper language: English

Authors: Qiong Wu, Xu Chen, Zhi Zhou

Networking and Internet Architecture; Artificial Intelligence; Distributed, Parallel, and Cluster Computing



To meet the ever-increasing mobile traffic demand in the 5G era, base stations (BSs) have been densely deployed in radio access networks (RANs) to increase network coverage and capacity. However, because this high BS density is designed to accommodate peak traffic, it consumes an unnecessarily large amount of energy if all BSs stay on during off-peak times. An effective way to reduce the energy consumption of cellular networks is to deactivate idle base stations that do not serve any traffic demand. In this paper, we develop a traffic-aware dynamic BS sleep control framework, named DeepBSC, which presents a novel data-driven learning approach to determine the BS active/sleep modes while achieving lower energy consumption and meeting Quality of Service (QoS) requirements. Specifically, the traffic demands are predicted by the proposed GS-STN model, which leverages the geographical and semantic spatial-temporal correlations of mobile traffic. With accurate mobile traffic forecasting, the BS sleep control problem is cast as a Markov Decision Process that is solved by Actor-Critic reinforcement learning methods. To reduce the variance of cost estimation in the dynamic environment, we propose a benchmark transformation method that provides a robust performance indicator for policy updates. To expedite the training process, we adopt a Deep Deterministic Policy Gradient (DDPG) approach, together with an explorer network that further strengthens exploration. Extensive experiments with a real-world dataset corroborate that our proposed framework significantly outperforms the existing methods.
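As a rough illustration of the energy/QoS trade-off behind BS sleep control, the sketch below scores one control step for a small set of base stations: every BS draws power (less when asleep), and any demand exceeding the total capacity of the active BSs incurs a penalty. The function name, the power figures, the penalty weight, and the simplification that any active BS can serve any demand are all hypothetical; the abstract does not specify DeepBSC's actual cost function or its benchmark transformation.

```python
from typing import Sequence


def step_cost(
    active: Sequence[bool],
    demand: Sequence[float],
    capacity: Sequence[float],
    active_power: float = 1.0,   # assumed power draw of an active BS
    sleep_power: float = 0.1,    # assumed power draw of a sleeping BS
    qos_weight: float = 5.0,     # assumed penalty per unit of unserved demand
) -> float:
    """Hypothetical per-step cost: energy used plus a penalty for unserved demand."""
    if not (len(active) == len(demand) == len(capacity)):
        raise ValueError("active, demand and capacity must have the same length")

    # Energy term: each BS contributes its active or sleep power.
    energy = sum(active_power if is_on else sleep_power for is_on in active)

    # QoS term: demand beyond the pooled capacity of active BSs goes unserved
    # (simplifying assumption: traffic can be offloaded to any active BS).
    available = sum(c for is_on, c in zip(active, capacity) if is_on)
    unserved = max(0.0, sum(demand) - available)

    return energy + qos_weight * unserved


if __name__ == "__main__":
    demand = [3.0, 1.0]
    capacity = [5.0, 5.0]
    for modes in ([True, True], [True, False], [False, False]):
        print(modes, step_cost(modes, demand, capacity))
```

In an MDP formulation, a learning agent would choose the `active` vector at each step to keep a cost of this kind low over time, using forecast demand rather than the true values.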


Yanan Wang, Tong Xu, Xin Niu (2019)

The development of intelligent traffic light control systems is essential for smart transportation management. While some efforts have been made to optimize individual traffic lights in isolation, related studies have largely ignored two facts: traffic lights at multiple intersections influence one another spatially, and current traffic light control depends temporally on historical traffic status. To that end, in this paper, we propose a novel SpatioTemporal Multi-Agent Reinforcement Learning (STMARL) framework for effectively capturing the spatio-temporal dependency of multiple related traffic lights and controlling these traffic lights in a coordinated way. Specifically, we first construct the traffic light adjacency graph based on the spatial structure among traffic lights. Then, historical traffic records are integrated with the current traffic status via a Recurrent Neural Network structure. Moreover, based on the temporally dependent traffic information, we design a Graph Neural Network based model to represent relationships among multiple traffic lights, and the decision for each traffic light is made in a distributed way by the deep Q-learning method. Finally, experimental results on both synthetic and real-world data demonstrate the effectiveness of our STMARL framework, which also provides an insightful understanding of the influence mechanism among multi-intersection traffic lights.
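One simple way to picture an adjacency graph over traffic lights is sketched below: two lights are linked when they lie within a given distance of each other. This distance-threshold rule, the function name, and the coordinates are assumptions for illustration only; the abstract says the graph is built from the spatial structure among traffic lights but does not give the construction rule.

```python
import math
from typing import List, Sequence, Tuple


def build_adjacency(
    positions: Sequence[Tuple[float, float]], radius: float
) -> List[List[int]]:
    """Symmetric 0/1 adjacency matrix linking lights within `radius` of each other."""
    n = len(positions)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # math.dist gives the Euclidean distance between two points.
            if math.dist(positions[i], positions[j]) <= radius:
                adj[i][j] = adj[j][i] = 1
    return adj


if __name__ == "__main__":
    # Hypothetical intersection coordinates (arbitrary units).
    lights = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
    for row in build_adjacency(lights, radius=1.5):
        print(row)
```

A graph neural network would then pass information along the nonzero entries of this matrix; a road-connectivity rule could replace the distance threshold without changing the matrix format.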

Multi-Agent Systems; Artificial Intelligence; Machine Learning

CVLight: Deep Reinforcement Learning for Adaptive Traffic Signal Control with Connected Vehicles

Wangzhi Li, Yaxing Cai, Ujwal Dinesha (2021)

This paper develops a reinforcement learning (RL) scheme for adaptive traffic signal control (ATSC), called CVLight, that leverages data collected only from connected vehicles (CV). Seven types of RL models are proposed within this scheme that contain various state and reward representations, including incorporation of CV delay and green light duration into state and the usage of CV delay as reward. To further incorporate information of both CV and non-CV into CVLight, an algorithm based on actor-critic, A2C-Full, is proposed where both CV and non-CV information is used to train the critic network, while only CV information is used to update the policy network and execute optimal signal timing. These models are compared at an isolated intersection under various CV market penetration rates. A full model with the best performance (i.e., minimum average travel delay per vehicle) is then selected and applied to compare with state-of-the-art benchmarks under different levels of traffic demands, turning proportions, and dynamic traffic demands, respectively. Two case studies are performed on an isolated intersection and a corridor with three consecutive intersections located in Manhattan, New York, to further demonstrate the effectiveness of the proposed algorithm under real-world scenarios. Compared to other baseline models that use all vehicle information, the trained CVLight agent can efficiently control multiple intersections solely based on CV data and can achieve a similar or even greater performance when the CV penetration rate is no less than 20%.

Machine Learning; Artificial Intelligence; Systems and Control

A Deep Reinforcement Learning Approach for Traffic Signal Control Optimization

Zhenning Li, Chengzhong Xu, Guohui Zhang (2021)

Inefficient traffic signal control methods may cause numerous problems, such as traffic congestion and waste of energy. Reinforcement learning (RL) is a trending data-driven approach for adaptive traffic signal control in complex urban traffic networks. Although the development of deep neural networks (DNN) further enhances its learning capability, there are still some challenges in applying deep RL to transportation networks with multiple signalized intersections, including non-stationary environments, the exploration-exploitation dilemma, multi-agent training schemes, and continuous action spaces. To address these issues, this paper first proposes a multi-agent deep deterministic policy gradient (MADDPG) method by extending the actor-critic policy gradient algorithms. MADDPG follows a centralized-learning, decentralized-execution paradigm in which critics use additional information to streamline the training process, while actors act on their own local observations. The model is evaluated via simulation on the Simulation of Urban MObility (SUMO) platform. Model comparison results show the efficiency of the proposed algorithm in controlling traffic lights.

Signal Processing; Artificial Intelligence; Machine Learning

Base Station Network Traffic Prediction Approach Based on LMA-DeepAR

Jiachen Zhang, Xingquan Zuo, Mingying Xu (2021)

Accurate network traffic prediction for base station cells is vital for deciding when to expand or reduce the wireless devices in a cell. The burstiness and uncertainty of base station cell network traffic make the traffic nonlinear and non-stationary, which brings challenges to long-term prediction. In this paper, the traffic model LMA-DeepAR for base station networks is established based on DeepAR. According to the distribution characteristics of network traffic, this paper proposes an artificial feature sequence calculation method based on the local moving average (LMA). The feature sequence is input into DeepAR as a covariate, so that the statistical characteristics of network traffic over a recent period are considered when updating parameters, and the interference of non-stationary network traffic with model training is reduced. Experimental results show that the proposed prediction approach (LMA-DeepAR) outperforms other methods in overall long-term prediction performance and stability for multi-cell network traffic.
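To make the idea of a moving-average feature sequence concrete, here is a minimal sketch that computes a trailing local moving average over a traffic series. The window length, the past-only (trailing) alignment, and the function name are assumptions for illustration; the abstract does not give the exact LMA formula used in LMA-DeepAR.

```python
from typing import List, Sequence


def local_moving_average(series: Sequence[float], window: int = 3) -> List[float]:
    """Trailing mean over the last `window` points (fewer points at the start)."""
    if window < 1:
        raise ValueError("window must be at least 1")

    out: List[float] = []
    running = 0.0
    for i, x in enumerate(series):
        running += x
        if i >= window:
            # Drop the value that just left the window.
            running -= series[i - window]
        out.append(running / min(i + 1, window))
    return out


if __name__ == "__main__":
    # Hypothetical hourly traffic volumes with one burst.
    traffic = [10.0, 12.0, 50.0, 11.0, 9.0]
    print(local_moving_average(traffic, window=3))
```

A smoothed sequence like this could be supplied next to the raw traffic as an extra input feature, so that the model sees recent average levels as well as individual bursts.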

Networking and Internet Architecture

Defining Traffic States using Spatio-temporal Traffic Graphs

Debaditya Roy, K. Naveen Kumar, C. Krishna Mohan (2020)

Intersections are one of the main sources of congestion, and hence it is important to understand traffic behavior at intersections. Particularly in developing countries with high vehicle density, mixed traffic types, and lane-less driving behavior, it is difficult to distinguish between congested and normal traffic behavior. In this work, we propose a way to understand the traffic state of smaller spatial regions at intersections using traffic graphs. The way these traffic graphs evolve over time reveals different traffic states: a) congestion is forming (clumping), b) the congestion is dispersing (unclumping), or c) the traffic is flowing normally (neutral). We train a spatio-temporal deep network to identify these changes. We also introduce a large dataset called EyeonTraffic (EoT) containing 3 hours of aerial videos collected at 3 busy intersections in Ahmedabad, India. Our experiments on the EoT dataset show that the traffic graphs can help in correctly identifying congestion-prone behavior in different spatial regions of an intersection.

Computer Vision and Pattern Recognition; Artificial Intelligence; Signal Processing