Basic Cycle of Operations
Until unknown time (system death), the system repeats the
following basic instruction cycle over and over.
- 1.
- Select instruction head with probability ,
where
Here the collective decision function maps
real-valued to real values. Given an appropriate , each
module may "veto" instructions suggested by the other module.
Only instructions that are strongly supported by both modules are
highly likely to be selected. One possibility is
. In the experiments I use .
Comment: owing to peculiarities of certain instructions to be
introduced below, will later be refined for cases where
addresses an instruction head as opposed to an argument.
- 2.
- 's arguments
are
selected according to probability distributions
(except when Bet! -- two of Bet!'s arguments will be treated
differently -- see Section A.3.4 below).
- 3.
- Execute the selected instruction. This will consume time and may
change (1) environment , (2) IP, (3) internal state
; (4a) , (4b) . If there is
external reward then set
(rewards
become visible to the system in the form of inputs).
- 4.
- If an input has changed one of the cell contents ,
, then shift the contents of
, ,, to components
, ,,,
respectively. This results in a built-in short-term memory
(long-term memory can be implemented by the system itself by
executing appropriate instruction sequences).
- 5.
- If did not modify IP (no conditional jump -- compare
instruction list below), then compute the address of the next
instruction head by setting IP
.
Here
, where
is selected according to probability distribution , while
is selected according
to .
- 6.
- Goto 1.
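The selection rule in step 1 can be illustrated with a minimal sketch. It is not the original implementation: the formulas did not survive in this copy of the text, so the sketch assumes that each module supplies one non-negative value per instruction, that the collective decision function is the plain product of the two values, and that the results are normalized into a probability distribution. The names collective_probabilities, select_instruction, left and right are hypothetical.

```python
import random

def collective_probabilities(left, right, f=lambda x, y: x * y):
    """Combine two modules' per-instruction values into one distribution.

    ASSUMPTION: f defaults to the product, so a value of 0 from either
    module gives that instruction probability 0 (a "veto"). Normalizing
    by the sum is also an assumption of this sketch.
    """
    scores = [f(l, r) for l, r in zip(left, right)]
    total = sum(scores)
    if total <= 0:
        raise ValueError("at least one instruction needs positive joint support")
    return [s / total for s in scores]

def select_instruction(left, right, rng=random):
    """Draw an instruction index according to the combined distribution."""
    probs = collective_probabilities(left, right)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

# Hypothetical values for three candidate instructions.
left = [0.5, 0.5, 0.0]    # first module gives instruction 2 no support
right = [0.1, 0.8, 0.1]   # second module mildly supports instruction 2
probs = collective_probabilities(left, right)
chosen = select_instruction(left, right)
```

Under these assumptions instruction 2 is never chosen, because the first module assigns it zero support, while instruction 1, which both modules support strongly, receives most of the probability mass.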
Juergen Schmidhuber
2003-03-10