ﻻ يوجد ملخص باللغة العربية
Animal brains evolved to optimize behavior in dynamically changing environments, selecting actions that maximize future rewards. A large body of experimental work indicates that such optimization changes the wiring of neural circuits, appropriately mapping environmental input onto behavioral outputs. A major unsolved scientific question is how optimal wiring adjustments, which must target the connections responsible for rewards, can be accomplished when the relation between sensory inputs, action taken, environmental context with rewards is ambiguous. The computational problem of properly targeting cues, contexts and actions that lead to reward is known as structural, contextual and temporal credit assignment respectively. In this review, we survey prior approaches to these three types of problems and advance the notion that the brains specialized neural architectures provide efficient solutions. Within this framework, the thalamus with its cortical and basal ganglia interactions serve as a systems-level solution to credit assignment. Specifically, we propose that thalamocortical interaction is the locus of meta-learning where the thalamus provides cortical control functions that parametrize the cortical activity association space. By selecting among these control functions, the basal ganglia hierarchically guide thalamocortical plasticity across two timescales to enable meta-learning. The faster timescale establishes contextual associations to enable rapid behavioral flexibility while the slower one enables generalization to new contexts. Incorporating different thalamic control functions under this framework clarifies how thalamocortical-basal ganglia interactions may simultaneously solve the three credit assignment problems.
Backpropagation (BP) uses detailed, unit-specific feedback to train deep neural networks (DNNs) with remarkable success. That biological neural circuits appear to perform credit assignment, but cannot implement BP, implies the existence of other powe
The success of deep learning sparked interest in whether the brain learns by using similar techniques for assigning credit to each synaptic weight for its contribution to the network output. However, the majority of current attempts at biologically-p
We consider the problem of efficient credit assignment in reinforcement learning. In order to efficiently and meaningfully utilize new data, we propose to explicitly assign credit to past decisions based on the likelihood of them having led to the ob
We address the problem of credit assignment in reinforcement learning and explore fundamental questions regarding the way in which an agent can best use additional computation to propagate new information, by planning with internal models of the worl
Credit value adjustment (CVA) is the charge applied by financial institutions to the counterparty to cover the risk of losses on a counterpart default event. In this paper we estimate such a premium under the Bates stochastic model (Bates [4]), which