
Open Problems in Cooperative AI

Published by Kevin McKee
Publication date: 2020
Research field: Informatics Engineering
Language: English





Problems of cooperation--in which agents seek ways to jointly improve their welfare--are ubiquitous and important. They can be found at scales ranging from our daily routines--such as driving on highways, scheduling meetings, and working collaboratively--to our global challenges--such as peace, commerce, and pandemic preparedness. Arguably, the success of the human species is rooted in our ability to cooperate. Since machines powered by artificial intelligence are playing an ever greater role in our lives, it will be important to equip them with the capabilities necessary to cooperate and to foster cooperation. We see an opportunity for the field of artificial intelligence to explicitly focus effort on this class of problems, which we term Cooperative AI. The objective of this research would be to study the many aspects of the problems of cooperation and to innovate in AI to contribute to solving these problems. Central goals include building machine agents with the capabilities needed for cooperation, building tools to foster cooperation in populations of (machine and/or human) agents, and otherwise conducting AI research for insight relevant to problems of cooperation. This research integrates ongoing work on multi-agent systems, game theory and social choice, human-machine interaction and alignment, natural-language processing, and the construction of social tools and platforms. However, Cooperative AI is not the union of these existing areas, but rather an independent bet about the productivity of specific kinds of conversations that involve these and other areas. We see opportunity to more explicitly focus on the problem of cooperation, to construct unified theory and vocabulary, and to build bridges with adjacent communities working on cooperation, including in the natural, social, and behavioural sciences.




Read also

Computer games represent an ideal research domain for the next generation of personalized digital applications. This paper presents a player-centered framework of AI for game personalization, complementary to the commonly used system-centered approaches. Built on the Structure of Actions theory, the paper maps out the current landscape of game personalization research and identifies eight open problems that need further investigation. These problems require deep collaboration between technological advancement and player experience design.
Game AI competitions are important for fostering research and development on Game AI and AI in general. These competitions supply different challenging problems that can be translated into other contexts, virtual or real. They provide frameworks and tools to facilitate the research on their core topics and provide means for comparing and sharing results. A competition is also a way to motivate new researchers to study these challenges. In this document, we present the Geometry Friends Game AI Competition. Geometry Friends is a two-player cooperative physics-based puzzle platformer computer game. The concept of the game is simple, though solving it has proven difficult. While the main and apparent focus of the game is cooperation, it also relies on other AI-related problems such as planning, plan execution, and motion control, all connected to situational awareness. All of these must be solved in real time. In this paper, we discuss the competition and the challenges it brings, and present an overview of the current solutions.
Recent superhuman results in games have largely been achieved in a variety of zero-sum settings, such as Go and Poker, in which agents need to compete against others. However, just like humans, real-world AI systems have to coordinate and communicate with other agents in cooperative partially observable environments as well. These settings commonly require participants to both interpret the actions of others and to act in a way that is informative when being interpreted. Those abilities are typically summarized as theory of mind and are seen as crucial for social interactions. In this paper we propose two different search techniques that can be applied to improve an arbitrary agreed-upon policy in a cooperative partially observable game. The first one, single-agent search, effectively converts the problem into a single-agent setting by making all but one of the agents play according to the agreed-upon policy. In contrast, in multi-agent search all agents carry out the same common-knowledge search procedure whenever doing so is computationally feasible, and fall back to playing according to the agreed-upon policy otherwise. We prove that these search procedures are theoretically guaranteed to at least maintain the original performance of the agreed-upon policy (up to a bounded approximation error). In the benchmark challenge problem of Hanabi, our search technique greatly improves the performance of every agent we tested and when applied to a policy trained using RL achieves a new state-of-the-art score of 24.61 / 25 in the game, compared to a previous-best of 24.08 / 25.
The ability to create artificial intelligence (AI) capable of performing complex tasks is rapidly outpacing our ability to ensure the safe and assured operation of AI-enabled systems. Fortunately, a landscape of AI safety research is emerging in response to this asymmetry, yet there is a long way to go. In particular, recent simulation environments created to illustrate AI safety risks are relatively simple or narrowly focused on a particular issue. Hence, we see a critical need for AI safety research environments that abstract essential aspects of complex real-world applications. In this work, we introduce the AI safety TanksWorld as an environment for AI safety research with three essential aspects: competing performance objectives, human-machine teaming, and multi-agent competition. The AI safety TanksWorld aims to accelerate the advancement of safe multi-agent decision-making algorithms by providing a software framework to support competitions with both system performance and safety objectives. As a work in progress, this paper introduces our research objectives and learning environment with reference code and baseline performance metrics to follow in a future work.
The travelling thief problem (TTP) is a representative of multi-component optimisation problems with interacting components. TTP combines the knapsack problem (KP) and the travelling salesman problem (TSP). A thief performs a cyclic tour through a set of cities, and pursuant to a collection plan, collects a subset of items into a rented knapsack with finite capacity. The aim is to maximise profit while minimising renting cost. Existing TTP solvers typically solve the KP and TSP components in an interleaved manner: the solution of one component is kept fixed while the solution of the other component is modified. This suggests low coordination between solving the two components, possibly leading to low quality TTP solutions. The 2-OPT heuristic, which reverses a segment in the tour, is often used for solving the TSP component. Within TTP, 2-OPT does not take into account the collection plan, which can result in a lower objective value. This in turn can result in the tour modification being rejected by a solver. We propose an expanded form of 2-OPT to change the collection plan in coordination with tour modification. Items regarded as less profitable and collected in cities located earlier in the reversed segment are substituted by items that tend to be more profitable and not collected in cities located later in the reversed segment. The collection plan is further changed through a modified form of the hill-climbing bit-flip search, where changes in the collection state are only permitted for boundary items, which are defined as lowest profitable collected items or highest profitable uncollected items. This restriction reduces the time spent on the KP component, allowing more tours to be evaluated by the TSP component within a time budget. The proposed approaches form the basis of a new cooperative coordination solver, which is shown to outperform several state-of-the-art TTP solvers.
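
For readers unfamiliar with the segment reversal mentioned in the abstract above, the following is a minimal sketch of the plain 2-OPT move on a TSP tour. It is not the expanded, collection-aware form the paper proposes: it ignores the knapsack component entirely, and the function names (`tour_length`, `two_opt_move`, `two_opt`) and the use of Euclidean distance are assumptions made for illustration.

```python
import math

def tour_length(tour, coords):
    """Length of the cyclic tour, assuming Euclidean distances."""
    n = len(tour)
    return sum(math.dist(coords[tour[k]], coords[tour[(k + 1) % n]])
               for k in range(n))

def two_opt_move(tour, i, j):
    """Return a copy of the tour with the segment tour[i..j] reversed."""
    return tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]

def two_opt(tour, coords):
    """Apply improving 2-OPT moves until none remains (city 0 stays first)."""
    best = list(tour)
    best_len = tour_length(best, coords)
    improved = True
    while improved:
        improved = False
        for i in range(1, len(best) - 1):
            for j in range(i + 1, len(best)):
                cand = two_opt_move(best, i, j)
                cand_len = tour_length(cand, coords)
                if cand_len < best_len - 1e-12:  # accept strict improvements only
                    best, best_len = cand, cand_len
                    improved = True
    return best, best_len

# Hypothetical example: four corners of a unit square visited in a crossing order.
coords = {0: (0.0, 0.0), 1: (1.0, 1.0), 2: (1.0, 0.0), 3: (0.0, 1.0)}
best_tour, best_len = two_opt([0, 1, 2, 3], coords)
```

In a TTP setting the acceptance test would compare the full objective (profit minus renting cost) rather than tour length alone, which is the gap the abstract describes.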
