
Responding to Illegal Activities Along the Canadian Coastlines Using Reinforcement Learning

Published by: Wail Gueaieb
Publication date: 2021
Research language: English





This article elaborates on how machine learning (ML) can be leveraged to solve a contemporary problem related to the security of maritime domains. Worldwide, "Illegal, Unreported, and Unregulated" (IUU) fishing incidents have led to serious environmental and economic consequences, involving drastic changes in our ecosystems in addition to financial losses caused by the depletion of natural resources. The Fisheries and Aquaculture Department of the United Nations Food and Agriculture Organization (FAO) issued a report indicating that the annual losses due to IUU fishing reach $25 billion. This imposes negative impacts on the future biodiversity of the marine ecosystem and on domestic Gross National Product (GNP). Hence, robust interception mechanisms are increasingly needed for detecting and pursuing the unrelenting illegal fishing incidents in maritime territories. This article addresses the problem of coordinating the motion of a fleet of marine vessels (pursuers) to catch an IUU vessel while it is still in local waters. The problem is formulated as a pursuer-evader problem tackled within an ML framework. One or more pursuers, such as law enforcement vessels, intercept an evader (i.e., the illegal fishing ship) using an online reinforcement learning mechanism based on a value iteration process. It employs real-time navigation measurements of the evader ship as well as those of the pursuing vessels, and returns model-free interception strategies.
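To make the value-iteration idea concrete, here is a minimal, self-contained sketch, not the paper's algorithm: the grid size, reward values, and the stationary-evader simplification are our assumptions. It shows a model-free learner acquiring an interception policy purely from observed transitions, in the spirit of value-iteration-based Q-learning.

```python
import random

# State = evader position relative to the pursuer on a bounded grid.
# The learner sees only transitions and rewards, never a model of the evader.
ACTIONS = [(1, 0), (-1, 0), (0, 1), (0, -1)]  # E, W, N, S
R = 5  # relative offsets are clamped to [-R, R]

def clamp(v):
    return max(-R, min(R, v))

def step(state, action):
    # Used only to generate experience; the learner treats it as a black box.
    dx, dy = state
    ax, ay = action
    nxt = (clamp(dx - ax), clamp(dy - ay))  # pursuer closes the offset
    done = nxt == (0, 0)                    # interception achieved
    return nxt, (10.0 if done else -1.0), done

def train(episodes=3000, alpha=0.5, gamma=0.9, eps=0.2, seed=0):
    rng = random.Random(seed)
    Q = {}
    q = lambda s, a: Q.get((s, a), 0.0)
    for _ in range(episodes):
        s = (rng.randint(-R, R), rng.randint(-R, R))
        for _ in range(4 * R):
            if s == (0, 0):
                break
            a = rng.choice(ACTIONS) if rng.random() < eps else \
                max(ACTIONS, key=lambda a: q(s, a))
            s2, r, done = step(s, a)
            target = r if done else r + gamma * max(q(s2, b) for b in ACTIONS)
            Q[(s, a)] = q(s, a) + alpha * (target - q(s, a))
            s = s2
            if done:
                break
    return Q

def intercept_policy(Q, s):
    # Greedy (model-free) interception strategy extracted from the values.
    return max(ACTIONS, key=lambda a: Q.get((s, a), 0.0))

Q = train()
action = intercept_policy(Q, (3, 0))  # greedy action when the evader is 3 cells east
```

The paper's mechanism is online and handles a moving evader from real-time navigation measurements of both vessels; the sketch only illustrates how interception strategies can emerge without an explicit model of the evader's dynamics.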




Read also

Inefficient traffic signal control methods may cause numerous problems, such as traffic congestion and wasted energy. Reinforcement learning (RL) is a trending data-driven approach for adaptive traffic signal control in complex urban traffic networks. Although the development of deep neural networks (DNNs) further enhances its learning capability, there are still challenges in applying deep RL to transportation networks with multiple signalized intersections, including non-stationary environments, the exploration-exploitation dilemma, multi-agent training schemes, continuous action spaces, etc. To address these issues, this paper first proposes a multi-agent deep deterministic policy gradient (MADDPG) method by extending actor-critic policy gradient algorithms. MADDPG follows a centralized-learning, decentralized-execution paradigm in which critics use additional information to streamline the training process, while actors act only on their own local observations. The model is evaluated via simulation on the Simulation of Urban MObility (SUMO) platform. Model comparison results show the efficiency of the proposed algorithm in controlling traffic lights.
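The centralized-learning, decentralized-execution split that MADDPG relies on can be sketched in a few lines. The dimensions, class names, and the linear critic below are illustrative assumptions, not the paper's architecture.

```python
import numpy as np

# Each intersection's actor sees only its local observation, while a single
# shared critic scores the joint observation-action vector during training.
rng = np.random.default_rng(0)
N_AGENTS, OBS, ACT = 3, 8, 2  # e.g. 3 intersections, toy dimensions

class Actor:
    def __init__(self):
        self.W = rng.normal(size=(ACT, OBS)) * 0.1
    def __call__(self, obs):           # local observation only
        return np.tanh(self.W @ obs)   # continuous action in [-1, 1]

class CentralCritic:
    def __init__(self):
        self.w = rng.normal(size=N_AGENTS * (OBS + ACT)) * 0.1
    def __call__(self, all_obs, all_acts):  # joint information, training only
        return float(self.w @ np.concatenate([*all_obs, *all_acts]))

actors = [Actor() for _ in range(N_AGENTS)]
critic = CentralCritic()

obs = [rng.normal(size=OBS) for _ in range(N_AGENTS)]
acts = [pi(o) for pi, o in zip(actors, obs)]  # decentralized execution
q = critic(obs, acts)                         # centralized evaluation
```

At deployment time only the actors are needed, which is what makes the execution decentralized; the critic's extra (joint) information is used solely during training.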
Unmanned aerial vehicles (UAVs) are now beginning to be deployed to enhance network performance and coverage in wireless communication. However, due to the limitations of their on-board power and flight time, it is challenging to obtain an optimal resource allocation scheme for the UAV-assisted Internet of Things (IoT). In this paper, we design a new UAV-assisted IoT system relying on the shortest flight path of the UAVs while maximising the amount of data collected from IoT devices. A deep reinforcement learning-based technique is then conceived for finding the optimal trajectory and throughput in a specific coverage area. After training, the UAV is able to autonomously collect all the data from user nodes with a significant total sum-rate improvement while minimising the associated resources used. Numerical results are provided to highlight how our technique strikes a balance between the throughput attained, the trajectory, and the time spent. More explicitly, we characterise the attainable performance in terms of the UAV trajectory, the expected reward, and the total sum-rate.
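The trade-off between data collected and flight resources spent can be captured by a toy episodic environment. The grid layout, reward weights, and the nearest-node baseline are our illustrative assumptions, not the paper's model.

```python
# The UAV flies in unit grid steps; entering a cell with an IoT node collects
# its data; every step costs energy, so shorter trajectories score higher.
class UAVDataEnv:
    def __init__(self, nodes):
        self.nodes = set(nodes)   # cells holding IoT devices with pending data
        self.pos = (0, 0)
        self.total_reward = 0.0

    def step(self, heading):      # heading in {"N", "S", "E", "W"}
        dx, dy = {"N": (0, 1), "S": (0, -1), "E": (1, 0), "W": (-1, 0)}[heading]
        self.pos = (self.pos[0] + dx, self.pos[1] + dy)
        r = -0.1                  # per-step energy/flight-time cost
        if self.pos in self.nodes:
            self.nodes.discard(self.pos)
            r += 5.0              # data payload collected
        self.total_reward += r
        return r, not self.nodes  # (reward, done)

def nearest_node_policy(env):
    # Simple baseline a trained DRL agent should beat: fly toward the closest node.
    tx, ty = min(env.nodes,
                 key=lambda n: abs(n[0] - env.pos[0]) + abs(n[1] - env.pos[1]))
    x, y = env.pos
    if tx > x:
        return "E"
    if tx < x:
        return "W"
    return "N" if ty > y else "S"
```

An agent trained in such an environment is rewarded for trajectories that collect every payload in as few steps as possible, which is the same tension between throughput and flight time the paper studies.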
Wei Cui, Wei Yu (2020)
This paper proposes a novel scalable reinforcement learning approach for simultaneous routing and spectrum access in wireless ad-hoc networks. In most previous works on reinforcement learning for network optimization, the network topology is assumed to be fixed and a different agent is trained for each transmission node, which limits scalability and generalizability. Further, routing and spectrum access are typically treated as separate tasks. Moreover, the optimization objective is usually a cumulative metric along the route, e.g., the number of hops or the delay. In this paper, we account for the physical-layer signal-to-interference-plus-noise ratio (SINR) in a wireless network and further show that a bottleneck objective, such as the minimum SINR along the route, can also be optimized effectively using reinforcement learning. Specifically, we propose a scalable approach in which a single agent is associated with each flow and makes routing and spectrum access decisions as it moves along the frontier nodes. The agent is trained according to the physical-layer characteristics of the environment using a novel reward scheme based on Monte Carlo estimation of the future bottleneck SINR. It learns to avoid interference by intelligently making joint routing and spectrum allocation decisions based on the geographical location information of the neighbouring nodes.
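The difference between cumulative and bottleneck objectives is easy to see on a toy topology. The link SINR values below are made-up numbers purely for illustration.

```python
# Candidate routes from A to D with per-hop SINR in dB (illustrative values).
routes = {
    "A-B-D":   [12.0, 3.5],
    "A-C-D":   [7.0, 8.5],
    "A-B-C-D": [12.0, 9.0, 8.5],
}

def bottleneck(route_sinrs):
    # A route is only as good as its weakest link.
    return min(route_sinrs)

best = max(routes, key=lambda r: bottleneck(routes[r]))
```

A cumulative metric such as hop count would prefer either 2-hop route, but the max-min bottleneck criterion selects the 3-hop route, whose weakest link (8.5 dB) is the strongest among the candidates.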
Stop-and-go traffic poses many challenges to the transportation system, but its formation and mechanism are still under exploration. However, it has been shown that introducing Connected Automated Vehicles (CAVs) with carefully designed controllers can dampen stop-and-go waves in a vehicle fleet. Instead of using an analytical model, this study adopts reinforcement learning to control the behavior of a CAV, placing a single CAV in the second position of a vehicle fleet with the purpose of damping the speed oscillation from the fleet leader and helping the following human drivers adopt smoother driving behavior. The results show that our controller decreases the speed oscillation of the CAV by 54% and by 8%-28% for the following human-driven vehicles. Significant fuel consumption savings are also observed. Additionally, the results suggest that CAVs may act as traffic stabilizers if they choose to behave slightly altruistically.
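As intuition for how a second-position CAV can damp the leader's oscillation, here is a deliberately simple, non-learned stand-in for the controller: it tracks a low-pass-filtered version of the leader's speed. The filter gain and the sinusoidal leader profile are our choices, not the study's learned policy.

```python
import math

def smooth_follow(leader_speeds, alpha=0.15):
    # First-order lag toward the leader's speed: high-frequency
    # oscillations are attenuated, the mean speed is preserved.
    v = leader_speeds[0]
    out = []
    for target in leader_speeds:
        v += alpha * (target - v)
        out.append(v)
    return out

def oscillation(speeds):
    # Peak-to-peak speed amplitude, a crude oscillation measure.
    return max(speeds) - min(speeds)

# Leader oscillates around 20 m/s with +/- 5 m/s stop-and-go waves.
leader = [20 + 5 * math.sin(0.5 * t) for t in range(100)]
cav = smooth_follow(leader)
```

A learned controller, as in the study, adapts this damping behavior from data (and must also respect safe spacing), rather than applying a fixed filter.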
We demonstrate, for the first time, experimental over-the-fiber training of transmitter neural networks (NNs) using reinforcement learning. Optical back-to-back training of a novel NN-based digital predistorter outperforms arcsine-based predistortion with up to 60% bit-error-rate reduction.

