The competitive multi-armed bandit (CMAB) problem is related to social issues such as maximizing total social benefits while preserving equality among individuals by overcoming conflicts between individual decisions, which could seriously decrease social benefits. The study described herein provides experimental evidence that entangled photons physically resolve the CMAB problem in the two-arm, two-player case, maximizing the social rewards while ensuring equality. Moreover, we demonstrate that deception, or outperforming the other player by receiving a greater reward, cannot be accomplished in a polarization-entangled-photon-based system, whereas deception is achievable in systems based on classical polarization-correlated photons with fixed polarizations. In addition, random polarization-correlated photons have been studied numerically and shown to ensure equality between players and to prevent deception as well, although the maximum CMAB performance is reduced compared with the entangled-photon experiments. Autonomous alignment schemes for polarization bases were also experimentally demonstrated, based only on the decision-conflict information observed by an individual, without communication between players. This study paves the way for collective decision making in uncertain, dynamically changing environments based on entangled quantum states, a crucial step toward utilizing quantum systems for intelligent functionalities.
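The effect of conflict avoidance described above can be illustrated with a small numerical sketch. It is not a model of the experiment: it only assumes that ideal entangled pairs measured in aligned bases yield perfectly anti-correlated, individually random choices, and compares them with independent random choices. The reward probabilities, the rule that a shared machine's payout is split evenly, and the names `play_round` and `simulate` are all assumptions introduced for illustration.

```python
import random


def play_round(choice1, choice2, probs, rng):
    """Return (reward1, reward2) for one round with two players and two machines."""
    # Each machine pays out 1 with its own probability in this round.
    payout = [1.0 if rng.random() < p else 0.0 for p in probs]
    if choice1 == choice2:
        # Assumed conflict rule: the payout of a shared machine is split evenly.
        half = payout[choice1] / 2.0
        return half, half
    return payout[choice1], payout[choice2]


def simulate(strategy, probs=(0.8, 0.6), rounds=10_000, seed=0):
    """Return ([total1, total2], number_of_conflicts) for the given strategy."""
    rng = random.Random(seed)
    totals = [0.0, 0.0]
    conflicts = 0
    for _ in range(rounds):
        c1 = rng.randrange(2)  # player 1's choice is random in both strategies
        if strategy == "anticorrelated":
            c2 = 1 - c1  # idealized stand-in for entangled-photon outcomes
        elif strategy == "independent":
            c2 = rng.randrange(2)  # no correlation between the players
        else:
            raise ValueError(f"unknown strategy: {strategy}")
        conflicts += int(c1 == c2)
        r1, r2 = play_round(c1, c2, probs, rng)
        totals[0] += r1
        totals[1] += r2
    return totals, conflicts


if __name__ == "__main__":
    for name in ("anticorrelated", "independent"):
        totals, conflicts = simulate(name)
        print(name, "totals:", totals, "conflicts:", conflicts)
```

Under these assumptions the anti-correlated strategy never produces a conflict, and because each player's choice is individually random, the two players' accumulated rewards stay close to each other.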
Decision making is critical in our daily lives and for society in general and is finding ever more practical applications in information and communication technologies. Herein, we demonstrate experimentally that single photons can be used to make decisions in uncertain, dynamically changing environments. Using a nitrogen-vacancy center in a nanodiamond as a single-photon source, we demonstrate the decision-making capability by solving the multi-armed bandit problem. This capability is directly and immediately associated with single-photon detection in the proposed architecture, leading to adequate and adaptive autonomous decision making. This study makes it possible to create systems that benefit from the quantum nature of light to perform practical and vital intelligent functions.
Decision making is a vital function in this age of machine learning and artificial intelligence, yet its physical realization and theoretical fundamentals are still not completely understood. In our previous study, we demonstrated that single photons can be used to make decisions in uncertain, dynamically changing environments. The two-armed bandit problem was successfully solved using the dual probabilistic and particle attributes of single photons. In this study, we present a category-theoretic modeling and analysis of single-photon-based decision making, including a quantitative analysis that is in agreement with the experimental results. A category-theoretic model reveals the complex interdependencies of subject-matter entities in a simplified manner, even in dynamically changing environments. In particular, the octahedral and braid structures in triangulated categories provide a better understanding and quantitative metrics of the underlying mechanisms of a single-photon decision maker. This study provides both insight and a foundation for analyzing more complex and uncertain problems, to further machine learning and artificial intelligence.
The multi-armed bandit problem (MBP) is the problem of finding, as accurately and quickly as possible, the most profitable option from a set of options that give stochastic rewards, by referring to past experience. Inspired by the fluctuating movements of a rigid body in a tug-of-war game, we formulated a unique search algorithm, which we call the "tug-of-war (TOW) dynamics", for solving the MBP efficiently. Cognitive medium access, which refers to multi-user channel allocation in cognitive radio, can be interpreted as the competitive multi-armed bandit problem (CMBP); the problem is to determine the optimal strategy for allocating channels to users that yields the maximum total reward gained by all users. Here we show that it is possible to construct a physical device for solving the CMBP, which we call the "TOW Bombe", by exploiting the TOW dynamics that exist in coupled incompressible-fluid cylinders. This analog computing device achieves the "socially maximum" resource allocation that maximizes the total rewards in cognitive medium access without paying a huge computational cost that grows exponentially as a function of the problem size.
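To make the MBP definition above concrete, the following minimal sketch plays a two-armed Bernoulli bandit. It deliberately uses the standard epsilon-greedy rule as a stand-in and is not the TOW dynamics of the abstract. The reward probabilities, the exploration rate, and the function name `epsilon_greedy` are assumptions chosen only for illustration.

```python
import random


def epsilon_greedy(probs, steps=5000, epsilon=0.1, seed=0):
    """Play a Bernoulli bandit; return (pull counts, estimated means, total reward)."""
    rng = random.Random(seed)
    counts = [0] * len(probs)
    means = [0.0] * len(probs)  # running estimate of each arm's reward probability
    total = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(len(probs))
        else:
            # Exploit: pick the arm with the highest estimate so far.
            arm = max(range(len(probs)), key=lambda i: means[i])
        reward = 1.0 if rng.random() < probs[arm] else 0.0
        counts[arm] += 1
        # Incremental update of the sample mean from past experience.
        means[arm] += (reward - means[arm]) / counts[arm]
        total += reward
    return counts, means, total


if __name__ == "__main__":
    counts, means, total = epsilon_greedy([0.3, 0.7])
    print("pulls per arm:", counts)
    print("estimated reward probabilities:", means)
    print("total reward:", total)
```

The trade-off the sketch exposes is the one the MBP is about: a larger `epsilon` identifies the better arm sooner but wastes more plays on the worse one.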
Collective decision making is important for maximizing total benefits while preserving equality among individuals in the competitive multi-armed bandit (CMAB) problem, wherein multiple players try to gain higher rewards from multiple slot machines. The CMAB problem represents an essential aspect of applications such as resource management in social infrastructure. In a previous study, we theoretically and experimentally demonstrated that entangled photons can physically resolve the difficulty of the CMAB problem. This decision-making strategy completely avoids decision conflicts while ensuring equality. However, decision conflicts can sometimes be beneficial if they yield greater rewards than non-conflicting decisions, indicating that greedy actions may provide positive effects depending on the given environment. In this study, we demonstrate a mixed strategy of entangled- and correlated-photon-based decision-making so that total rewards can be enhanced when compared to the entangled-photon-only decision strategy. We show that an optimal mixture of entangled- and correlated-photon-based strategies exists depending on the dynamics of the reward environment as well as the difficulty of the given problem. This study paves the way for utilizing both quantum and classical aspects of photons in a mixed manner for decision making and provides yet another example of the supremacy of mixed strategies known in game theory, especially in evolutionary game theory.
Situations involving competition for resources among entities can be modeled by the competitive multi-armed bandit (CMAB) problem, which relates to social issues such as maximizing the total outcome and achieving the fairest distribution of resources among individuals. In these respects, the intrinsic randomness and global properties of quantum states provide ideal tools for obtaining optimal solutions to this problem. Building on the previous study of the CMAB problem in the two-arm, two-player case, this paper presents the theoretical principles necessary to find polarization-entangled N-photon states that can optimize the total resource output while ensuring equality among players. These principles were applied to the two-, three-, four-, and five-player cases by using numerical simulations to reproduce realistic configurations and find the best strategies to overcome potential misalignment between the polarization measurement systems of the players. Although a general formula for the N-player case is not presented here, general derivation rules and a verification algorithm are proposed. This report demonstrates the potential usability of quantum states in collective decision making with limited, probabilistic resources, which could serve as a first step toward quantum-based resource allocation systems.