ﻻ يوجد ملخص باللغة العربية
This paper introduces a machine learning based collaborative multi-band spectrum sensing policy for cognitive radios. The proposed sensing policy guides secondary users to focus the search of unused radio spectrum to those frequencies that persistently provide them high data rate. The proposed policy is based on machine learning, which makes it adaptive with the temporally and spatially varying radio spectrum. Furthermore, there is no need for dynamic modeling of the primary activity since it is implicitly learned over time. Energy efficiency is achieved by minimizing the number of assigned sensors per each subband under a constraint on miss detection probability. It is important to control the missed detections because they cause collisions with primary transmissions and lead to retransmissions at both the primary and secondary user. Simulations show that the proposed machine learning based sensing policy improves the overall throughput of the secondary network and improves the energy efficiency while controlling the miss detection probability.
We propose a successive convex approximation based off-policy optimization (SCAOPO) algorithm to solve the general constrained reinforcement learning problem, which is formulated as a constrained Markov decision process (CMDP) in the context of avera
Finding an optimal sensing policy for a particular access policy and sensing scheme is a laborious combinatorial problem that requires the system model parameters to be known. In practise the parameters or the model itself may not be completely known
Most reinforcement learning (RL) algorithms assume online access to the environment, in which one may readily interleave updates to the policy with experience collection using that policy. However, in many real-world applications such as health, educ
We propose a policy improvement algorithm for Reinforcement Learning (RL) which is called Rerouted Behavior Improvement (RBI). RBI is designed to take into account the evaluation errors of the Q-function. Such errors are common in RL when learning th
Lane-change maneuvers are commonly executed by drivers to follow a certain routing plan, overtake a slower vehicle, adapt to a merging lane ahead, etc. However, improper lane change behaviors can be a major cause of traffic flow disruptions and even