Next: Basic Cycle of Operations Up: Exploring the Predictable In Previous: Acknowledgments


Appendix

In what follows I will describe details of the system used in the experiments.

Architecture. The internal state $\cal S$ consists of $m$ addressable cells with addresses ranging from 0 to $m-1$ (instead of ranging from 1 to $m$; this is due to a peculiarity of C, the implementation language). ${\cal S}_k \in \{-M, -M+1, \ldots, 0, 1, \ldots, M\}$ is the current content of the cell with address $k$. Instructions and arguments are encoded by a fixed set $I$ of $n$ integer values $\{0, \ldots, n-1\}$. For each value $j$ in $I$, there is an instruction head $a_j$ with $n_j < BS - 2$ integer-valued arguments, where $BS$ is the instruction blocksize, and $m$ is a multiple of $BS$. In the experiments I use $BS = 9$ (since $n_j < BS - 2 = 7$, there are at most six arguments per instruction), $n = 24$, $m = BS \, (n^2 ~div~ BS) = 576$, and $M = 100,000$. See list below for instruction syntax and semantics.
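
The parameter choices above can be checked with a few lines of C. This is a minimal sketch under stated assumptions, not the code used in the experiments: all identifiers (`N_VALUES`, `M_CELLS`, `MAX_ABS`, `state`, `write_cell`, and the helper functions) are hypothetical, and the text does not say how the implementation treats values outside $\{-M, \ldots, M\}$, so the sketch merely tests for them.

```c
#include <assert.h>
#include <stdbool.h>

/* Parameters as stated in the text; the identifier names are assumptions. */
enum {
    BS = 9,                                      /* instruction blocksize */
    N_VALUES = 24,                               /* n: size of the value set I */
    M_CELLS = BS * ((N_VALUES * N_VALUES) / BS)  /* m = BS (n^2 div BS) */
};
#define MAX_ABS 100000L                          /* M: cell contents lie in [-M, M] */

/* Internal state S: m cells with addresses 0..m-1.
   Objects with static storage duration start out as zero. */
long state[M_CELLS];

/* Addresses are 0-based, matching C array indexing. */
bool is_valid_address(long k) { return k >= 0 && k < M_CELLS; }

/* Cell contents must lie in {-M, ..., M}. */
bool is_valid_cell_value(long v) { return v >= -MAX_ABS && v <= MAX_ABS; }

/* Largest argument count n_j that satisfies n_j < BS - 2. */
int max_args_per_instruction(void) { return BS - 3; }

/* Store v in cell k; aborts on an invalid address or value (an assumption). */
void write_cell(long k, long v) {
    assert(is_valid_address(k) && is_valid_cell_value(v));
    state[k] = v;
}
```

With $n = 24$ the integer division is exact ($576 ~div~ 9 = 64$), so $m = n^2 = 576$. The cell type is `long` because C only guarantees that `int` covers $\pm 32767$, which is less than $M$.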

RIGHT and LEFT modules. All ${\sc Right}_i$ and ${\sc Left}_i$ ($i \in \{0, \ldots, m-1\}$) are vectors of $n$ positive, real values that sum up to 1.0. The $k$-th component of ${\sc Right}_i$ (${\sc Left}_i$) is denoted ${\sc Right}_{i,k}$ (${\sc Left}_{i,k}$) for $k \in \{0, \ldots, n-1\}$. A variable InstructionPointer (IP) with range $\{0, \ldots, m-1\}$ always points to one of the module pair's columns. IP is viewed as a modifiable part of the environment.
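
One way to picture the module pair is as two $m \times n$ tables of probabilities, with IP selecting one column from each. The following C sketch makes that concrete under assumptions of my own: the struct `Modules`, the functions `is_distribution` and `current_right_column`, and the tolerance parameter are all hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stdbool.h>

enum { N_VALUES = 24, M_CELLS = 576 };   /* n and m from the text */

/* Hypothetical container for the module pair and the instruction pointer. */
typedef struct {
    double right[M_CELLS][N_VALUES];     /* right[i][k] stands for Right_{i,k} */
    double left[M_CELLS][N_VALUES];      /* left[i][k] stands for Left_{i,k} */
    int ip;                              /* InstructionPointer, in 0..m-1 */
} Modules;

/* True if v holds n strictly positive values whose sum is 1 within tol. */
bool is_distribution(const double *v, int n, double tol) {
    double sum = 0.0;
    for (int k = 0; k < n; k++) {
        if (v[k] <= 0.0) return false;
        sum += v[k];
    }
    double diff = sum - 1.0;
    if (diff < 0.0) diff = -diff;
    return diff <= tol;
}

/* Column of the Right module currently selected by IP. */
const double *current_right_column(const Modules *mod) {
    assert(mod->ip >= 0 && mod->ip < M_CELLS);
    return mod->right[mod->ip];
}
```

A `Modules` object occupies roughly 216 KB, so in this sketch it is better given static storage or allocated on the heap than placed on the stack of a deeply nested call.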

Initialization. At system birth (time 0), all ${\sc Right}_{i,k}$ and ${\sc Left}_{i,k}$ are set equal to $1/n$. All ${\cal S}_k$ and IP are set to zero. They are never re-initialized. To be able to restore modified module columns if necessary, we introduce two initially empty stacks Stack${\sc Right}$ and Stack${\sc Left}$ that allow for variable-sized stack entries, and the conventional push and pop operations. Instructions may change the Boolean variables BlockSSALEFT and BlockSSARIGHT (both are modifiable parts of the environment $\cal E$ and initially FALSE at time 0).
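
The initialization step can be sketched in C as follows. This is an illustration under assumptions, not the original implementation: the names `System`, `Entry`, `Stack`, `system_init`, `stack_push`, and `stack_pop` are hypothetical, and the text only requires that stack entries may vary in size, so storing each entry as a length plus an array of `double` values is my own choice of payload.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

enum { N_VALUES = 24, M_CELLS = 576 };   /* n and m from the text */

/* One variable-sized stack entry (flexible array member);
   the payload layout is an assumption. */
typedef struct Entry {
    struct Entry *below;                 /* next entry down the stack */
    size_t len;                          /* number of values stored */
    double data[];
} Entry;

typedef struct { Entry *top; } Stack;

typedef struct {
    double right[M_CELLS][N_VALUES];
    double left[M_CELLS][N_VALUES];
    long state[M_CELLS];                 /* internal state S */
    int ip;                              /* InstructionPointer */
    bool block_ssa_left, block_ssa_right;
    Stack stack_right, stack_left;
} System;

/* Set everything to its value at time 0. */
void system_init(System *sys) {
    assert(sys != NULL);
    for (int i = 0; i < M_CELLS; i++) {
        for (int k = 0; k < N_VALUES; k++) {
            sys->right[i][k] = 1.0 / N_VALUES;   /* uniform: 1/n */
            sys->left[i][k] = 1.0 / N_VALUES;
        }
        sys->state[i] = 0;
    }
    sys->ip = 0;
    sys->block_ssa_left = false;
    sys->block_ssa_right = false;
    sys->stack_right.top = NULL;         /* both stacks start empty */
    sys->stack_left.top = NULL;
}

/* Push a copy of len values; returns false if allocation fails. */
bool stack_push(Stack *s, const double *values, size_t len) {
    Entry *e = malloc(sizeof *e + len * sizeof e->data[0]);
    if (e == NULL) return false;
    e->below = s->top;
    e->len = len;
    memcpy(e->data, values, len * sizeof e->data[0]);
    s->top = e;
    return true;
}

/* Pop the top entry, or return NULL if the stack is empty.
   The caller releases the entry with free(). */
Entry *stack_pop(Stack *s) {
    Entry *e = s->top;
    if (e != NULL) s->top = e->below;
    return e;
}
```

In this sketch `system_init` is meant to be called exactly once, in keeping with the statement that nothing is re-initialized afterwards.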



Juergen Schmidhuber 2003-03-10

