
WRITES WITH 2 ARGUMENTS

We keep the task from section 4.2. The primitive ``WriteWeight'' is redefined, however, and gets an additional argument. The primitive ``GetInput'' is redefined and gets a new name: ``ReadWeight''. There is no separate WeightPointer any more, and no automatic increment mechanism for WeightPointer's position. Instead, the new primitives may directly address, read and write the network's weights. The other primitives remain the same. Here are the two new ones, together with their instruction numbers (compare section 5):

1: WriteWeight(address1, address2). $w_i$ is set equal to the contents of address1, where $i$ is the value found in address2.

5: ReadWeight(address1, address2). $w_i$ is written into the address found at location address1, where $i$ is the value found in address2.

Appropriate syntax checks halt programs whenever they attempt to do something impossible, such as writing a non-existent weight. Since the new ``WriteWeight'' primitive has an additional argument that must be guessed correctly, successful programs tend to have lower probability.
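The following minimal sketch (in Python) illustrates how two such primitives might operate on an integer-addressed storage and a weight vector. The function names, the dictionary-based storage model, and the Halt mechanism are assumptions made for illustration; they are not the paper's implementation.

  class Halt(Exception):
      """Raised when a program attempts something impossible."""

  def write_weight(addr1, addr2, cells, w):
      # Instruction 1: w_i := contents of addr1, where i = contents of addr2.
      i = cells[addr2]
      if i not in w:                      # syntax check: non-existent weight
          raise Halt(f"no weight with index {i}")
      w[i] = cells[addr1]

  def read_weight(addr1, addr2, cells, w):
      # Instruction 5: w_i is written into the address found at addr1,
      # where i = contents of addr2.
      i = cells[addr2]
      if i not in w:                      # syntax check: non-existent weight
          raise Halt(f"no weight with index {i}")
      target = cells[addr1]               # indirect addressing
      if target not in cells:             # syntax check: non-existent cell
          raise Halt(f"no storage cell at address {target}")
      cells[target] = w[i]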

RESULTS. Out of $10^{8}$ runs, consuming a total search time of $1.436*10^{9}$ time steps, 3 runs generated weight vectors fitting the training data. All of them allowed for perfect generalization on all the test data. During execution, one of them filled the storage as shown in table 1. The program ran for 399 of the 1024 allocated time steps. Its space probability was $9.95*10^{-9}$. What it does is this:


Table: Used storage after execution of a successful program for the adding perceptron, using a ``WriteWeight'' primitive with two arguments.

Address:  -100  -99  ...   -3   -2   -1    0    1    2    3    4    5    6    7    8
Contents:    0    0  ...    0    0  100    7    1    8   -1    1   -1   -1    2    0


(0) Allocate one cell on the work tape. Initialize with zero. Set Min = Min-1.

(2) Increment the contents of address -1.

(4) Make $w_{c_{-1}}$ equal to the contents of address -1.

(7) Jump to address 0.

Repeated execution of the instruction at address 0 unnecessarily allocates 100 cells of the work tape, but this does no damage other than slightly slowing down the program. Using different sets of 3 training examples (obtained by randomly permuting the input units that are never on) led to very similar generalization results.
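As a rough illustration, the loop traced above can be reconstructed in a few lines of Python. The number of weights, the one-step-per-instruction cost model, and the exact halting condition are assumptions chosen only to reproduce the trace; they are not taken from the paper.

  def run_discovered_program(n_weights=99, max_steps=1024):
      cells = {}       # integer-addressed work tape
      w = {}           # weight vector, assumed to be indexed 1..n_weights
      Min = 0          # lowest allocated address
      steps = 0
      while steps < max_steps:
          Min -= 1; cells[Min] = 0; steps += 1   # (0) allocate one cell, init with 0
          cells[-1] += 1; steps += 1             # (2) increment contents of address -1
          i = cells[-1]; steps += 1              # (4) WriteWeight(-1, -1)
          if not 1 <= i <= n_weights:            # syntax check halts the program
              break
          w[i] = cells[-1]
          steps += 1                             # (7) jump back to address 0
      return cells, w, steps

  cells, w, steps = run_discovered_program()
  # Under these assumptions the run stops after 399 steps with
  # cells[-1] == 100, cells[-2] ... cells[-100] == 0, and w[i] == i for i = 1..99.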

Numerous additional experiments (including rather complicated maze tasks) were performed by Norbert Jankowski and Stefan Heil at TUM. Some are described, e.g., in Heil's diploma thesis (1995).

General remarks. The bias towards algorithmic simplicity is a very general one. It is weaker than most kinds of problem-specific inductive bias, e.g. (Utgoff, 1986; Haussler, 1988). If a solution is indeed simple, the bias is justified (it does not require us to know ``the way in which the solution is simple''). If the solution is not simple, the bias towards algorithmic simplicity does not do much damage: even in the case of algorithmically complex solutions we cannot lose much by focusing our search on simple candidates first, before looking at more complex candidates. This is because, in general, the complex candidates greatly outnumber the simple ones; the few simple ones do not significantly affect the total search time of an optimal search algorithm.
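To make the counting argument concrete (an illustration, not from the paper): among binary programs there are at most $2^{l}$ candidates of length $l$, so all programs shorter than some length $L$ together number fewer than $2^{L}$, which is no more than the number of length-$L$ candidates alone. Assuming each candidate is tested for a comparable amount of time, trying all shorter candidates first at most roughly doubles the effort spent at length $L$, leaving the total search time essentially unchanged.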



