Reinforcement learning involves decision making in dynamic and uncertain environments and constitutes a crucial element of artificial intelligence. In our previous work, we experimentally demonstrated that the ultrafast chaotic oscillatory dynamics of lasers can be used to solve the two-armed bandit problem efficiently, a task that requires decision making under the difficult trade-off known as the exploration-exploitation dilemma. However, that work involved only two choices; hence, the scalability of laser-chaos-based reinforcement learning remained to be clarified. In this study, we demonstrate a scalable, pipelined principle for solving the multi-armed bandit problem by introducing time-division multiplexing of chaotically oscillating ultrafast time series. We present experimental demonstrations in which bandit problems with up to 64 arms are successfully solved. Detailed analyses are also provided, including performance comparisons among laser chaos signals generated under different physical conditions, which correlate with the diffusivity inherent in the time series. This study paves the way for ultrafast reinforcement learning that exploits the ultrahigh bandwidth of light waves and practical enabling technologies.
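To make the time-division-multiplexing principle concrete, the sketch below simulates a pipelined, tree-structured decision maker in Python under stated assumptions: a logistic map stands in for the experimentally measured laser chaos waveform, and the threshold-adjustment rule (the constants STEP and OMEGA, and the functions chaotic_signal, select_arm, and update) is a simplified, hypothetical stand-in for the update scheme used in the experiments. Each of M = 6 successive samples of the signal resolves one bit of the arm index, so a single pass through the pipeline selects one of 2^M = 64 arms, matching the largest problem size reported above.

```python
import numpy as np

rng = np.random.default_rng(0)

M = 6                      # tree depth -> 2**M = 64 arms, as in the experiments
N_ARMS = 2 ** M
STEP = 0.02                # threshold shift per observed outcome (assumed value)
OMEGA = 0.999              # forgetting factor pulling thresholds back toward 0 (assumed)

# True reward probabilities of the simulated bandit (one arm is clearly best).
p_true = rng.uniform(0.2, 0.5, N_ARMS)
p_true[rng.integers(N_ARMS)] = 0.9

# One adjustable threshold per internal node of the binary decision tree.
thresholds = {}


def chaotic_signal(x):
    """Logistic-map surrogate for the laser chaos time series, mapped to [-1, 1]."""
    x = 3.99 * x * (1.0 - x)
    return x, 2.0 * x - 1.0


def select_arm(x):
    """Make M time-multiplexed binary decisions to pick one of 2**M arms."""
    path, node = [], ""
    for _ in range(M):
        x, s = chaotic_signal(x)           # next sample of the chaotic waveform
        th = thresholds.setdefault(node, 0.0)
        bit = 1 if s >= th else 0
        path.append((node, bit))
        node += str(bit)
    return x, int(node, 2), path


def update(path, reward):
    """Shift thresholds along the decision path toward the rewarded branch."""
    for node, bit in path:
        sign = -1.0 if bit == 1 else 1.0   # a lower threshold favors branch 1
        thresholds[node] = OMEGA * thresholds[node] + (sign if reward else -sign) * STEP


x = 0.3
hits = 0
counts = np.zeros(N_ARMS, dtype=int)
for t in range(20000):
    x, arm, path = select_arm(x)
    reward = rng.random() < p_true[arm]
    update(path, reward)
    hits += reward
    counts[arm] += 1

print(f"true best arm: {np.argmax(p_true)}, most-pulled arm: {np.argmax(counts)}, "
      f"empirical reward rate: {hits / 20000:.3f}")
```

Running the loop long enough typically concentrates selections on the highest-reward arm; the per-node thresholds play the role of the adjustable decision levels applied to the chaotic waveform, and deeper trees scale the same pipeline to larger numbers of arms.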