The method introduced above still has elements of global control: there is, for instance, the clock for synchronous updates. However, the bucket brigade credit assignment concept is also potentially relevant for continuous time models of neural processing. To come closer to asynchronous models from biology, we now give up the assumptions of predefined competitive subsets and of instant decay. To save the concept of winning units, we explicitly introduce fixed inhibitory connections (e.g. a variant of the on-center-off-surround structure; see [Kohonen, 1988] and [Grossberg, 1976]).

We assume that the output of a unit and the transmission properties of the excitatory connections are governed by differential equations which ensure that this output does not change significantly during the time needed to transport activation information from one unit to its successors. Then we write down a continuous time version of the weight changes caused by the neural bucket brigade for the case of this output being greater than zero:

Only positive weights appear in this formula; the inhibitory connections have to remain fixed. Tentatively denoting by we find (by letting ) that the weight-flow through a positive weight that does not receive external payoff has reached a dynamic equilibrium if equals all the time.

It should be noted that there is an important difference between a continuous time version based on local on-center-off-surround wiring and the discrete time version above. While the discrete time version assumes instant activation decay when the input to a competitive subset disappears, there will be no activation decay with on-center-off-surround structures: a winning unit keeps itself active after its input has vanished. It remains to be seen whether the bucket brigade mechanism can work sensibly in the presence of such hysteresis effects. The only experiments conducted so far were based on the discrete time version (see below).
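
The hysteresis effect mentioned above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not taken from the text: two units only, a clipped-linear activation function, the particular values of the self-excitation weight `w_self` and the mutual inhibition weight `w_inh`, and simple Euler integration. It models activation dynamics only, not the bucket brigade weight changes. Input is applied during a first phase and removed during a second phase.

```python
def clip01(u):
    """Clipped-linear activation (an assumed, illustrative choice)."""
    return min(1.0, max(0.0, u))


def simulate(w_self, w_inh=2.0, dt=0.01, steps_per_phase=1000):
    """Euler-integrate dx_i/dt = -x_i + f(w_self*x_i - w_inh*x_j + I_i).

    Phase 1 applies external input, phase 2 removes it.
    All parameter values are hypothetical.
    """
    x = [0.0, 0.0]
    phases = [(1.0, 0.5), (0.0, 0.0)]  # (input to unit 0, input to unit 1)
    for inputs in phases:
        for _ in range(steps_per_phase):
            # On-center (self-excitation) and off-surround (mutual inhibition).
            drive = [
                clip01(w_self * x[i] - w_inh * x[1 - i] + inputs[i])
                for i in range(2)
            ]
            x = [x[i] + dt * (-x[i] + drive[i]) for i in range(2)]
    return x


# With self-excitation the winner stays active after the input is removed.
persistent = simulate(w_self=2.0)
# Without self-excitation the activation decays once the input is removed.
decaying = simulate(w_self=0.0)

print("with self-excitation:   ", persistent)
print("without self-excitation:", decaying)
```

In this toy setting the unit that received the stronger input remains close to full activation after the input disappears, whereas without the self-excitatory connection its activation decays toward zero. This is the kind of persistent activation that a continuous time bucket brigade would have to cope with.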
