To meet the constantly increasing demand from mobile terminals for higher data rates under limited wireless spectrum resources, cognitive radio and spectrum aggregation technologies have attracted much attention owing to their capability to improve spectrum efficiency. Combining cognitive relay and spectrum aggregation, in this paper we propose a dynamic spectrum aggregation strategy based on Markov prediction of the spectrum state for cooperative relay networks in a multi-user, multi-relay scenario, aiming to guarantee the user channel capacity while maximizing the network throughput. The spectrum aggregation strategy is executed in two steps: first, the spectrum state is predicted via a Markov model; then, based on the prediction results, a spectrum aggregation strategy is applied. Simulation results show that the spectrum prediction process noticeably lowers the outage rate, and that the spectrum aggregation strategy greatly improves the network throughput.
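A minimal sketch of the idea, assuming a two-state (idle/busy) Markov channel model with a hypothetical transition matrix and a simple greedy aggregation rule; it is illustrative only, not the authors' implementation:

```python
import numpy as np

# Hypothetical two-state (0 = idle, 1 = busy) Markov chain per channel.
# P[i, j] = probability of moving from state i to state j in the next slot.
P = np.array([[0.9, 0.1],
              [0.3, 0.7]])

def predict_idle_prob(current_state: int) -> float:
    """One-step Markov prediction of the probability that the channel is idle."""
    return P[current_state, 0]

def aggregate_channels(current_states, required_capacity, channel_capacity, threshold=0.8):
    """Greedy aggregation sketch: pick channels predicted idle with high
    probability until the requested capacity is met (illustrative only)."""
    ranked = sorted(range(len(current_states)),
                    key=lambda c: predict_idle_prob(current_states[c]),
                    reverse=True)
    chosen, capacity = [], 0.0
    for c in ranked:
        if predict_idle_prob(current_states[c]) < threshold:
            break
        chosen.append(c)
        capacity += channel_capacity[c]
        if capacity >= required_capacity:
            break
    return chosen

print(aggregate_channels([0, 1, 0, 0], 2.0, [1.0, 1.2, 0.8, 1.1]))
```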
With the development of 5G and the Internet of Things, large numbers of wireless devices need to share the limited spectrum resources. Dynamic spectrum access (DSA) is a promising paradigm to remedy the inefficient spectrum utilization caused by the historical command-and-control approach to spectrum allocation. In this paper, we investigate the distributed multi-user DSA problem in a typical multi-channel cognitive radio network. The problem is formulated as a decentralized partially observable Markov decision process (Dec-POMDP), and we propose a centralized offline training and distributed online execution framework based on cooperative multi-agent reinforcement learning (MARL). We employ a deep recurrent Q-network (DRQN) to address the partial observability of the state for each cognitive user. The ultimate goal is to learn a cooperative strategy that maximizes the sum throughput of the cognitive radio network in a distributed fashion, without coordination information exchange between cognitive users. Finally, we validate the proposed algorithm in various settings through extensive experiments. The simulation results show that the proposed algorithm converges quickly and achieves near-optimal performance.
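A minimal sketch, assuming PyTorch and hypothetical observation/action dimensions, of the kind of recurrent Q-network (DRQN) each cognitive user could train under centralized training and distributed execution; it illustrates the idea rather than the paper's exact architecture:

```python
import torch
import torch.nn as nn

class DRQN(nn.Module):
    """Recurrent Q-network: an LSTM summarizes the history of partial
    observations, and a linear head outputs Q-values per channel action."""
    def __init__(self, obs_dim: int, n_actions: int, hidden: int = 64):
        super().__init__()
        self.encoder = nn.Linear(obs_dim, hidden)
        self.lstm = nn.LSTM(hidden, hidden, batch_first=True)
        self.q_head = nn.Linear(hidden, n_actions)

    def forward(self, obs_seq, hidden_state=None):
        # obs_seq: (batch, time, obs_dim)
        x = torch.relu(self.encoder(obs_seq))
        out, hidden_state = self.lstm(x, hidden_state)
        return self.q_head(out), hidden_state

# During distributed execution each agent keeps its own recurrent state.
net = DRQN(obs_dim=4, n_actions=3)      # hypothetical sizes
q_values, h = net(torch.zeros(1, 10, 4))
action = q_values[:, -1].argmax(dim=-1)  # greedy action from the last step
print(action.item())
```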
In this paper, a novel spectrum association approach for cognitive radio networks (CRNs) is proposed. The association between secondary users (SUs) in the network and frequency bands licensed to primary users (PUs) is investigated, based on a measure of both inference and confidence as well as a measure of quality of service. The problem is formulated as a matching game between SUs and PUs. In this game, SUs employ a soft-decision Bayesian framework to detect PU signals and rank them based on the logarithm of the a posteriori probability ratio. A performance measure that captures both this ranking metric and the rate is then computed by the SUs. Using this performance measure, each PU evaluates a utility function that it uses to build its own association preferences. A distributed algorithm that allows both SUs and PUs to interact and self-organize into a stable matching is proposed. Simulation results show that the proposed algorithm can improve the sum rate of the SUs by up to 20% and 60% relative to the deferred acceptance algorithm and a random channel allocation approach, respectively. The results also show an improved convergence time.
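For reference, a minimal sketch of the deferred-acceptance baseline the proposed matching algorithm is compared against, using hypothetical preference lists rather than the paper's utility functions:

```python
def deferred_acceptance(su_prefs, pu_prefs):
    """SU-proposing deferred acceptance: each secondary user (SU) proposes to
    primary-user (PU) bands in order of preference; each PU keeps the proposer
    it ranks highest and rejects the rest (one SU per band in this sketch)."""
    matched = {}                      # pu -> su
    next_choice = {su: 0 for su in su_prefs}
    free = list(su_prefs)
    while free:
        su = free.pop(0)
        if next_choice[su] >= len(su_prefs[su]):
            continue                  # SU has exhausted its preference list
        pu = su_prefs[su][next_choice[su]]
        next_choice[su] += 1
        if pu not in matched:
            matched[pu] = su
        elif pu_prefs[pu].index(su) < pu_prefs[pu].index(matched[pu]):
            free.append(matched[pu])  # current occupant is rejected
            matched[pu] = su
        else:
            free.append(su)           # proposal rejected, try the next band
    return matched

su_prefs = {"su1": ["pu1", "pu2"], "su2": ["pu1", "pu2"]}
pu_prefs = {"pu1": ["su2", "su1"], "pu2": ["su1", "su2"]}
print(deferred_acceptance(su_prefs, pu_prefs))  # {'pu1': 'su2', 'pu2': 'su1'}
```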
The traditional concept of cognitive radio is the coexistence of primary and secondary users in a multiplexed manner. We consider an opportunistic channel access scheme in IEEE 802.11-based networks subject to an interference mitigation scenario. According to the protocol rules and due to the constraint on message passing, the secondary user is unaware of the exact state of the primary user. In this paper, we propose an online algorithm for the secondary user that determines either a backoff counter or a decision to remain idle, in order to utilize the time/frequency slots left unoccupied by the primary user. The proposed algorithm is based on the conventional reinforcement learning technique known as Q-learning. Simulations have been conducted to demonstrate the strength of this algorithm, and the results are compared with a contemporary solution to this problem in which the secondary user is aware of some states of the primary user.
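A minimal sketch of the tabular Q-learning update such an online backoff/idle decision could be built on, with a hypothetical state encoding and action set (not the paper's exact formulation):

```python
import random
from collections import defaultdict

ACTIONS = ["idle", "backoff_1", "backoff_2", "backoff_4"]  # hypothetical action set
Q = defaultdict(float)                 # Q[(state, action)] -> value
alpha, gamma, epsilon = 0.1, 0.9, 0.1  # learning rate, discount, exploration

def choose_action(state):
    """Epsilon-greedy selection over the backoff/idle actions."""
    if random.random() < epsilon:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

def update(state, action, reward, next_state):
    """Standard Q-learning update; the reward would reflect a successful
    transmission versus a collision with the primary user."""
    best_next = max(Q[(next_state, a)] for a in ACTIONS)
    Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
```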
For agent-based models, in particular the Voter Model (VM), a general framework of aggregation is developed which exploits the symmetries of the agent network $G$. Depending on the symmetry group $\mathrm{Aut}_{\omega}(N)$ of the weighted agent network, certain ensembles of agent configurations can be interchanged without affecting the dynamical properties of the VM. These configurations can be aggregated into the same macro state, and the dynamical process projected onto these states is, contrary to the general case, still a Markov chain. The method facilitates the analysis of the relation between microscopic processes and their aggregation to a macroscopic level of description, and it gives insight into the complexity introduced into a system by heterogeneous interaction relations. In some cases the macro chain is solvable.
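As a concrete special case of such lumping, on the complete graph every permutation of agents is a symmetry, so all configurations with the same number k of agents holding opinion 1 can be aggregated into one macro state. A sketch of the resulting birth-death macro chain, assuming the usual single-update VM dynamics (one agent copies a uniformly chosen other agent per step), is:

```python
import numpy as np

def vm_macro_chain(n_agents: int) -> np.ndarray:
    """Macro-level transition matrix of the Voter Model on the complete graph.
    Macro state k = number of agents holding opinion 1."""
    N = n_agents
    T = np.zeros((N + 1, N + 1))
    for k in range(N + 1):
        up = (N - k) / N * k / (N - 1)    # a 0-agent copies a 1-agent
        down = k / N * (N - k) / (N - 1)  # a 1-agent copies a 0-agent
        T[k, k + 1 if k < N else k] += up
        T[k, k - 1 if k > 0 else k] += down
        T[k, k] += 1 - up - down
    return T

T = vm_macro_chain(5)
print(np.allclose(T.sum(axis=1), 1.0))  # each row is a probability distribution
```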
Designing clustered unmanned aerial vehicle (UAV) communication networks based on cognitive radio (CR) and reinforcement learning can significantly improve the intelligence level of such networks and the robustness of the system in a time-varying environment. In this context, designing smarter schemes for spectrum sensing and access is a key research issue in CR. We therefore focus on dynamic cooperative spectrum sensing and channel access in clustered cognitive UAV (CUAV) communication networks. Owing to the lack of prior statistical information on the primary user (PU) channel occupancy state, we propose to use multi-agent reinforcement learning (MARL) to model the CUAV spectrum competition and cooperative decision-making problem in this dynamic scenario, and a return function based on a weighted combination of sensing-transmission cost and utility is introduced to characterize the real-time rewards of the multi-agent game. On this basis, three algorithms are proposed: a time-slot multi-round revisit exhaustive search algorithm based on a virtual controller (VC-EXH), a Q-learning algorithm based on independent learners (IL-Q), and a deep Q-learning algorithm based on independent learners (IL-DQN). Further, the information exchange overhead, execution complexity, and convergence of the three algorithms are briefly analyzed. Numerical simulation analysis shows that all three algorithms converge quickly, significantly improve system performance, and increase the utilization of idle spectrum resources.
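A minimal sketch of the independent-learner idea behind IL-Q: each agent keeps its own Q-table and learns from a reward that is a hypothetical weighted combination of sensing/transmission cost and throughput utility (illustrative, not the paper's exact return function):

```python
import random
from collections import defaultdict

class IndependentQLearner:
    """One learner per CUAV agent; no Q-values or observations are shared."""
    def __init__(self, n_channels, alpha=0.1, gamma=0.9, epsilon=0.1):
        self.Q = defaultdict(float)
        self.n_channels = n_channels
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon

    def act(self, state):
        """Epsilon-greedy channel selection from the agent's own Q-table."""
        if random.random() < self.epsilon:
            return random.randrange(self.n_channels)
        return max(range(self.n_channels), key=lambda a: self.Q[(state, a)])

    def learn(self, state, action, reward, next_state):
        best_next = max(self.Q[(next_state, a)] for a in range(self.n_channels))
        td = reward + self.gamma * best_next - self.Q[(state, action)]
        self.Q[(state, action)] += self.alpha * td

def compound_reward(throughput, sensing_cost, w_util=1.0, w_cost=0.5):
    """Hypothetical weighted compound of utility and sensing-transmission cost."""
    return w_util * throughput - w_cost * sensing_cost

agents = [IndependentQLearner(n_channels=4) for _ in range(3)]  # three CUAV clusters
```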