
A PERCEPTRON FOR ADDING INPUT POSITIONS

The task. We use the same perceptron-like network and the same input data as above. The goal is different, however. The task is to find weights such that $y^p$ equals the sum of the positions of on-bits in $x^p$, for all ${100 \choose 3} = 161,700$ possible $x^p$. Again, the task will be made difficult by providing only very limited training data.
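For concreteness, here is a minimal sketch of the input space and the target function (an illustrative encoding only: each input is represented by the tuple of its on-bit positions, and 0-based unit positions are assumed here purely for illustration):

    from itertools import combinations

    N = 100  # number of input units

    # All {100 choose 3} = 161,700 inputs with exactly 3 on-bits, each
    # represented by the tuple of its on-bit positions.
    all_inputs = list(combinations(range(N), 3))
    assert len(all_inputs) == 161_700

    def target(on_positions):
        # Desired output: the sum of the positions of the on-bits.
        return sum(on_positions)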

The solution. The only solution to the problem is to make each $w_i$ equal to $i$. As in the example above, there are short and fast programs for computing this solution.
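To see why this works, assume (as in the previous task) that the output unit is linear, i.e., $y^p = \sum_i w_i x^p_i$. With $w_i = i$ we obtain $y^p = \sum_i i \, x^p_i = \sum_{i:\, x^p_i = 1} i$, which is exactly the sum of the positions of the on-bits in $x^p$.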

The training data. The 3 training inputs $x^1$, $x^2$, and $x^3$ from the previous task are used. The target values are different, however. Obviously, the target for input vector $x^1$ is 108, the target for $x^2$ is 126, and the target for $x^3$ is 221. Again, success is binary: a solution candidate is evaluated on the test data only if it fits the 3 training examples exactly.
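The selection step can be sketched as follows (a hedged illustration: output, evaluate, and train_pairs are hypothetical names; all_inputs and target come from the snippet above; x1, x2, x3 stand for the three training inputs, whose exact positions are given in section 4.3):

    def output(w, on_positions):
        # Perceptron output: the weighted sum reduces to a sum over on-bits.
        return sum(w[i] for i in on_positions)

    def evaluate(w, train_pairs, all_inputs):
        # Binary success: only a candidate that fits all 3 training
        # examples exactly is scored on the 161,697 unseen inputs.
        if any(output(w, x) != y for x, y in train_pairs):
            return None  # rejected; the test data is never consulted
        train_x = {x for x, _ in train_pairs}
        test = [x for x in all_inputs if x not in train_x]
        return sum(output(w, x) == target(x) for x in test) / len(test)

    # train_pairs = [(x1, 108), (x2, 126), (x3, 221)]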

Why conventional neural net algorithms fail to solve this problem: for the same reasons they fail on the previous problem (see section 4.3). The training set is far too small to allow reasonable generalization on the test set.

RESULTS. Programs fitting the training data were found in 10 out of $5.5 \times 10^{7}$ runs, consuming a total search time of $8.14 \times 10^{8}$ time steps. 8 of the 10 successful runs led to perfect generalization on the 161,697 unseen test examples.

The first weight vector fitting the training data was found after 6,902,963 runs. Again, the corresponding program was a pretty wild one, but it led to perfect generalization on all the test data. Before halting, the program used 502 out of 8192 allocated time steps. Its time probability was $2^{-8}$. Its space probability was $3.92 \times 10^{-16}$. Table 3 shows the used part of the storage after execution. What the program does is this:


Table 3: Used storage after execution of a program for the adding perceptron.

    Addresses:   -3   -2   -1    0    1    2    3    4    5    6    7    8    9   10   11   12   13   14
    Contents:   100    0    0    7    3    5   11   -2    1   -3    0   -3    9   11    8   -3    2   -1


(0) Allocate 3 cells on the work tape. Initialize them with zero. Set Min = Min - 3.

(2) Get the contents of the input field (see list of instructions in section 3) at position 11 (which is 0), and write it into address -2.

(5) Write the contents of address -3 onto the weight pointed to by WeightPointer and increment WeightPointer. Halt if WeightPointer is out of range.

(7) If the contents of address -3 is less than or equal to the contents of address 9, goto address 11; otherwise also goto address 11 (either way, execution continues at address 11).

(11) Increment the contents of address -3.

(13) Goto address -1.

(-1) If the contents of address 7 is less than or equal to the contents of address 3 (always true), goto address 5.

The instructions beginning at addresses (2), (7), and (-1) are useless, but at least they are not catastrophic. Essentially, the program first allocates space for a variable (initially zero) on the work tape (recall that the program tape is ``read/execute'' only and cannot be used for variables). It then executes a loop that increments this variable and writes its contents onto the network's weight vector.
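Stripped of the useless instructions, the loop amounts to the following sketch (Python used as hypothetical pseudocode; 0-based weight indices and the halt semantics of the write instruction are assumptions made for illustration):

    def evolved_program(n_weights=100):
        counter = 0                        # work-tape variable at address -3, initially zero
        w = [0] * n_weights
        weight_pointer = 0
        while weight_pointer < n_weights:  # halt once the pointer leaves its range
            w[weight_pointer] = counter    # (5): write the variable onto the current weight
            weight_pointer += 1
            counter += 1                   # (11): increment the variable
        return w

Under these assumptions the program leaves $w_i = i$ in every weight, and the counter's final value of 100 matches the contents of address -3 shown in Table 3.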

After $4.6 \times 10^{7}$ runs, a faster and rather elegant (nearly minimal) program was found. Table 4 shows the used storage after execution. The program ran for 302 out of 512 allocated time steps. Its space probability was $9.65 \times 10^{-10}$. Inspection reveals how the program operates.


Table 4: Used storage after execution of a more elegant program for the adding perceptron.

    Addresses:   -1    0    1    2    3    4    5    6    7    8    9
    Contents:   100    7    1    1   -1    8   -1    0    1    2    2


Using different sets of 3 training examples (obtained by randomly permuting the input units that are never on) led to very similar generalization results.

