Reinforcement learning involves decision making in dynamic and uncertain environments and constitutes a crucial element of artificial intelligence. In our previous work, we experimentally demonstrated that the ultrafast chaotic oscillatory dynamics of lasers can be used to efficiently solve the two-armed bandit problem, which requires decision making under a difficult trade-off known as the exploration-exploitation dilemma. However, only two selections were employed in that research; the scalability of laser-chaos-based reinforcement learning therefore remained to be clarified. In this study, we demonstrate a scalable, pipelined principle for solving the multi-armed bandit problem by introducing time-division multiplexing of a chaotically oscillating ultrafast time series. We present experimental demonstrations in which bandit problems with up to 64 arms are successfully solved. We also provide detailed analyses, including performance comparisons among laser chaos signals generated under different physical conditions, which correlate with the diffusivity inherent in the time series. This study paves the way for ultrafast reinforcement learning that takes advantage of the ultrahigh bandwidth of light waves and practical enabling technologies.
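As a rough illustration of the pipelined, time-division-multiplexed principle, the sketch below simulates a K-armed play as a cascade of log2(K) binary threshold decisions on successive samples of a chaotic time series. This is a minimal sketch under stated assumptions: a logistic map stands in for the laser chaos, and the threshold-update rule is a simple surrogate rather than the exact rule used in the study.

```python
import random

K = 64                        # number of arms (a power of two)
BITS = K.bit_length() - 1     # binary decisions per play (6 for K = 64)
probs = [random.random() for _ in range(K)]  # unknown reward probabilities

x = 0.4                       # state of the logistic-map surrogate for laser chaos
def chaos_sample():
    global x
    x = 3.99 * x * (1.0 - x)  # chaotic iteration in (0, 1)
    return 2.0 * x - 1.0      # rescale to [-1, 1]

thresholds = [0.0] * (K - 1)  # one threshold per node of the binary decision tree

def play_once(alpha=0.99, delta=0.02):
    node, arm = 0, 0
    for _ in range(BITS):     # time-division multiplexed binary decisions
        bit = 1 if chaos_sample() > thresholds[node] else 0
        arm = (arm << 1) | bit
        node = 2 * node + 1 + bit            # descend the decision tree
    reward = 1 if random.random() < probs[arm] else 0
    node = 0                  # revisit the path and adapt each threshold
    for level in range(BITS):
        bit = (arm >> (BITS - 1 - level)) & 1
        step = delta if bit != reward else -delta  # reinforce rewarded bits
        thresholds[node] = alpha * thresholds[node] + step
        node = 2 * node + 1 + bit
    return arm, reward

best = max(range(K), key=lambda i: probs[i])
hits = sum(play_once()[0] == best for _ in range(20000))
print(f"best-arm selection rate: {hits / 20000:.2f}")
```

The key scaling property is that the number of chaotic samples consumed per play grows only logarithmically with the number of arms, which is what makes a pipelined 64-arm demonstration feasible.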
Optical interconnects are a potential solution for attaining the large-bandwidth on-chip communications needed in high-performance computers at low power and low cost. Mode-division multiplexing (MDM) is an emerging technology that scales the capacity of a single wavelength carrier by the number of modes in a multimode waveguide and is attractive as a cost-effective means of achieving high-bandwidth-density on-chip communications. Advanced modulation formats with high spectral efficiency in MDM networks can further improve the data rate of the optical link. Here, we demonstrate an intra-chip MDM communications link employing advanced modulation formats with two waveguide modes. We demonstrate a compact single-wavelength-carrier link that is expected to support 2×100 Gb/s mode-multiplexed capacity. The link comprises integrated microring modulators at the transmitter, mode multiplexers, a multimode waveguide interconnect, mode demultiplexers, and integrated germanium-on-silicon photodetectors. Each mode channel achieves a 100 Gb/s line rate with an 84 Gb/s net payload data rate at 7% overhead for hard-decision forward error correction (HD-FEC) in OFDM/16-QAM signal transmission.
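For a concrete reading of the stated rates, the following back-of-the-envelope check uses only the numbers quoted above; attributing the gap between the post-FEC rate and the 84 Gb/s net payload to OFDM framing overhead (cyclic prefix, pilots) is our assumption.

```python
# Back-of-the-envelope check of the quoted rates (values from the text).
line_rate = 100e9                  # line rate per mode channel, bit/s
fec_overhead = 0.07                # 7% HD-FEC overhead
net_payload = 84e9                 # stated net payload per channel, bit/s

after_fec = line_rate / (1 + fec_overhead)  # ~93.5 Gb/s after HD-FEC
residual = after_fec / net_payload - 1      # ~11%; presumably OFDM framing
                                            # overhead (our assumption)
print(f"post-FEC rate: {after_fec / 1e9:.1f} Gb/s")
print(f"implied residual overhead: {residual:.0%}")
print(f"aggregate net capacity over 2 modes: {2 * net_payload / 1e9:.0f} Gb/s")
```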
Conventional stereoscopic displays suffer from the vergence-accommodation conflict and cause visual fatigue. Integral-imaging-based displays resolve this problem by directly projecting the sub-aperture views of a light field into the eyes using a microlens array or a similar structure. However, such displays have an inherent trade-off between angular and spatial resolution. In this paper, we propose a novel coded time-division multiplexing technique that projects encoded sub-aperture views to the eyes of a viewer with correct cues for the vergence-accommodation reflex. Given sparse light field sub-aperture views, our pipeline can provide the perception of high-resolution refocused images with minimal aliasing by jointly optimizing the sub-aperture views for display and the coded aperture pattern. This is achieved via deep learning in an end-to-end fashion by simulating light transport and image formation with Fourier optics. To our knowledge, this work is among the first to optimize a light field display pipeline with deep learning. We verify our idea with objective image quality metrics (PSNR and SSIM) and perform an extensive study of the customizable design variables in our display pipeline. Experimental results show that light fields displayed using the proposed technique have higher quality than those of baseline display designs.
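A minimal sketch of such an end-to-end optimization loop follows, assuming PyTorch and replacing the Fourier-optics light-transport simulation with a much simpler shift-and-add image-formation model; the view count, disparities, and loss are illustrative placeholders rather than the pipeline's actual components.

```python
import torch
import torch.nn.functional as F

V, H, W = 9, 64, 64                      # sub-aperture views, image size
target = torch.rand(1, 1, H, W)          # stand-in for the refocused target

# Trainable display variables: the encoded views and the coded aperture.
views = torch.rand(V, 1, H, W, requires_grad=True)
aperture_logits = torch.zeros(V, requires_grad=True)

# Fixed per-view disparities for the simulated focal plane (assumed).
shifts = [(dx, dy) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]

def form_retinal_image(views, aperture_logits):
    """Simulate time-multiplexed display: aperture-weighted sum of
    disparity-shifted sub-aperture views (simplified formation model)."""
    weights = torch.sigmoid(aperture_logits)     # transmittance in (0, 1)
    image = torch.zeros(1, 1, H, W)
    for v, (dx, dy) in enumerate(shifts):
        shifted = torch.roll(views[v:v + 1], shifts=(dy, dx), dims=(2, 3))
        image = image + weights[v] * shifted
    return image / weights.sum()

opt = torch.optim.Adam([views, aperture_logits], lr=1e-2)
for step in range(500):
    opt.zero_grad()
    loss = F.mse_loss(form_retinal_image(views, aperture_logits), target)
    loss.backward()
    opt.step()
print(f"final reconstruction MSE: {loss.item():.4f}")
```

Because both the views and the aperture weights receive gradients through the formation model, the display variables are optimized jointly, mirroring the end-to-end design described above.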
The subset sum problem is a typical NP-complete problem that is hard to solve efficiently because of its intrinsic superpolynomial scaling: increasing the problem size results in rapidly growing run times on conventional computers. Photons possess the unique features of extremely high propagation speed, weak interaction with the environment, and low detectable energy levels, and are therefore a promising candidate for meeting this challenge by constructing a photonic computer. However, most optical computing schemes, such as Fourier transformation, require very high operational precision and are hard to scale up. Here, we present a chip built-in photonic computer that efficiently solves the subset sum problem. We successfully map the problem into a three-dimensional waveguide network fabricated using the femtosecond laser direct-writing technique. We show that the photons sufficiently dissipate into the network and search all the possible paths for solutions in parallel. For instances composed of successive primes, the proposed approach exhibits a dominant advantage in time consumption even compared with supercomputers. Our results confirm the ability of light to realize a complicated computational function that is intractable with conventional computers, and suggest the subset sum problem as a good benchmarking platform for the race between photonic and conventional computers on the way towards photonic supremacy.
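The parallel path search can be pictured classically as follows: each layer of the network corresponds to one set element, and a photon entering a layer either keeps its position (the element is excluded) or is shifted by the element's value (the element is included), so the output positions reached are exactly the achievable subset sums. A minimal classical sketch of that search, with illustrative element values:

```python
# Classical sketch of the search the photonic network performs in parallel:
# one splitting layer per element, each photon taking "skip" or "include".
def reachable_sums(elements):
    sums = {0}                            # input port: the empty subset
    for value in elements:                # one splitting layer per element
        sums |= {s + value for s in sums}
    return sums

primes = [2, 3, 5, 7, 11, 13]             # successive primes, as in the text
target = 23
print(target in reachable_sums(primes))   # True: e.g. 2 + 3 + 5 + 13 = 23
```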
Fabricating powerful neuromorphic chips the size of a thumb requires miniaturizing their basic units: synapses and neurons. The challenge for neurons is to scale them down to submicrometer diameters while maintaining the properties that allow for reliable information processing: high signal-to-noise ratio, endurance, stability, and reproducibility. In this work, we show that compact spin-torque nano-oscillators can naturally implement such neurons, and we quantify their ability to realize an actual cognitive task. In particular, we show that they can naturally implement reservoir computing with high performance, and we detail the recipes for this capability.
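As a rough illustration of reservoir computing with a single nonlinear node, the sketch below time-multiplexes one nonlinear response (a tanh stand-in for the oscillator's amplitude dynamics) into virtual nodes and trains only a linear readout; the node model, task, and parameters are illustrative assumptions, not the device physics.

```python
import numpy as np

rng = np.random.default_rng(0)
N_VIRTUAL, LEAK = 50, 0.3             # virtual nodes per input sample

mask = rng.uniform(-1, 1, N_VIRTUAL)  # fixed random input mask

def reservoir_states(inputs):
    x = np.zeros(N_VIRTUAL)
    states = []
    for u in inputs:
        for i in range(N_VIRTUAL):    # time-multiplexed virtual nodes
            drive = mask[i] * u + 0.8 * x[i - 1]   # ring coupling to the
            x[i] = (1 - LEAK) * x[i] + LEAK * np.tanh(drive)  # previous node
        states.append(x.copy())
    return np.array(states)

# Toy benchmark: reconstruct a 10-step-delayed input from the reservoir state.
u = rng.uniform(-0.8, 0.8, 2000)
X, y = reservoir_states(u)[10:], u[:-10]
W = np.linalg.solve(X.T @ X + 1e-6 * np.eye(N_VIRTUAL), X.T @ y)  # ridge readout
nmse = np.mean((X @ W - y) ** 2) / np.var(y)
print(f"delay-10 memory task NMSE: {nmse:.3f}")
```

Only the readout weights are trained; the nonlinear node itself stays fixed, which is what makes the scheme attractive for nanodevices whose internal parameters are hard to tune.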
Existing evaluation suites for multi-agent reinforcement learning (MARL) do not assess generalization to novel situations as their primary objective (unlike supervised-learning benchmarks). Our contribution, Melting Pot, is a MARL evaluation suite that fills this gap and uses reinforcement learning to reduce the human labor required to create novel test scenarios. This works because one agent's behavior constitutes (part of) another agent's environment. To demonstrate scalability, we have created over 80 unique test scenarios covering a broad range of research topics such as social dilemmas, reciprocity, resource sharing, and task partitioning. We apply these test scenarios to standard MARL training algorithms and demonstrate how Melting Pot reveals weaknesses not apparent from training performance alone.
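The evaluation protocol can be sketched as follows, with a hypothetical interface (this is not Melting Pot's actual API): a focal agent is dropped into a scenario whose co-players are pretrained background agents, and only the focal agent's return is scored, so the score reflects generalization to novel co-players rather than training performance.

```python
from typing import Callable, List

def evaluate_scenario(make_env: Callable,
                      focal_policy: Callable,
                      background_policies: List[Callable],
                      episodes: int = 10) -> float:
    """Average focal-agent return against a fixed background population.

    `make_env`, the policies, and the env interface are hypothetical
    placeholders for whatever MARL stack is being evaluated.
    """
    total = 0.0
    for _ in range(episodes):
        env = make_env()
        observations = env.reset()            # one observation per player
        done, episode_return = False, 0.0
        while not done:
            actions = [focal_policy(observations[0])] + [
                policy(obs)
                for policy, obs in zip(background_policies, observations[1:])
            ]
            observations, rewards, done = env.step(actions)
            episode_return += rewards[0]      # only the focal agent is scored
        total += episode_return
    return total / episodes
```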