
Basic Cycle Modification

For didactic reasons I have waited until the end of this appendix to introduce a slight change in the basic cycle's instruction selection procedure (compare Section A.1). If address $i$ holds an instruction head rather than an argument, redefine

\[
Q(i,j) = \frac{f(\textsc{Right}_{i,j}, \textsc{Left}_{i,g(j)})}{\sum_k f(\textsc{Right}_{i,k}, \textsc{Left}_{i,g(k)})},
\qquad i \in \{0, BS, 2BS, \ldots\}, \quad j \in \{0, \ldots, n-1\}.
\]

If $a_j$ is IncProbRIGHT then $g(j)$ returns the index of $a_j$'s antagonistic instruction head IncProbLEFT. The same holds for the other pairs of antagonistic instruction heads: (DecProbRIGHT, DecProbLEFT), (MoveDistRIGHT, MoveDistLEFT), (GetRIGHT, GetLEFT), (EnableSSARIGHT, EnableSSALEFT). This is necessary because antagonistic instructions require special treatment to achieve module symmetry through SSAandCopy. For instance, suppose that LEFT's current advantage depends on supporting some IncProbRIGHT instruction. An equal RIGHT opponent should then strongly support IncProbLEFT instead.
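The redefined selection rule can be illustrated with a minimal Python sketch. Several things here are assumptions made only for illustration and are not taken from this appendix: the collective decision function $f$ is passed in as a parameter and defaults to a plain product as a placeholder (Section A.1 defines the actual $f$); the mapping $g$ is assumed to be symmetric (each LEFT head maps back to its RIGHT partner) and to map non-antagonistic instruction heads to themselves; and the names `antagonist_map`, `selection_probs` and the instruction `Jmp` are hypothetical.

```python
from typing import Callable, Dict, List


def antagonist_map(names: List[str]) -> Dict[int, int]:
    """Build g: instruction index -> index of its antagonistic head.

    Assumption: antagonists differ only in a RIGHT/LEFT suffix, and any
    instruction without such a partner is mapped to itself.
    """
    index = {name: j for j, name in enumerate(names)}
    g = {}
    for j, name in enumerate(names):
        if name.endswith("RIGHT"):
            partner = name[: -len("RIGHT")] + "LEFT"
        elif name.endswith("LEFT"):
            partner = name[: -len("LEFT")] + "RIGHT"
        else:
            partner = name
        g[j] = index.get(partner, j)  # fall back to identity if no partner
    return g


def selection_probs(
    right_row: List[float],
    left_row: List[float],
    g: Dict[int, int],
    f: Callable[[float, float], float] = lambda x, y: x * y,  # placeholder f
) -> List[float]:
    """Compute Q(i, .) for one address i that holds an instruction head.

    right_row[j] plays the role of Right_{i,j}, left_row[j] of Left_{i,j}.
    RIGHT's entry for j is combined with LEFT's entry for g(j).
    """
    weights = [f(right_row[j], left_row[g[j]]) for j in range(len(right_row))]
    total = sum(weights)
    if total == 0:
        raise ValueError("all combined weights are zero; Q is undefined")
    return [w / total for w in weights]


# Toy example: RIGHT favours IncProbRIGHT, LEFT favours IncProbLEFT.
names = ["IncProbRIGHT", "IncProbLEFT", "Jmp"]  # "Jmp" is a hypothetical filler
g = antagonist_map(names)
right = [0.6, 0.2, 0.2]
left = [0.2, 0.6, 0.2]
q = selection_probs(right, left, g)
```

In this toy setting RIGHT's preference for IncProbRIGHT is paired with LEFT's preference for IncProbLEFT, so the two modules reinforce each other on index 0 instead of cancelling, which is the symmetry the antagonist mapping is meant to provide.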

Juergen Schmidhuber 2003-03-10
