next up previous
Next: Experiments: Towers of Hanoi Up: Bias-Optimal Incremental Problem Solving Previous: OOPS For Finding Universal


A Particular Initial Programming Language

The efficient search and backtracking mechanism described in section 2.1 is not aware of the nature of the particular programming language given by $Q$, the set of initial instructions for modifying states. The language could be list-oriented such as LISP, or based on matrix operations for neural network-like parallel architectures, etc. For the experiments we wrote an interpreter for an exemplary, stack-based, universal programming language inspired by FORTH [8], whose disciples praise its beauty and the compactness of its programs.

Each task's tape holds its state: various stack-like data structures represented as sequences of integers, including a data stack ds (with stack pointer dp) for function arguments, an auxiliary data stack Ds, a function stack fns of entries describing (possibly recursive) functions defined by the system itself, a callstack cs (with stack pointer cp and top entry $cs[cp]$) for calling functions, where local variable $cs[cp].ip$ is the current instruction pointer, and base pointer $cs[cp].dp$ points into ds below the values considered as arguments of the most recent function call: Any instruction of the form inst ( $x_1, \ldots, x_n$) expects its $n$ arguments on top of ds, and replaces them by its return values. Illegal use of any instruction will cause the currently tested program prefix to halt. In particular, it is illegal to set variables (such as stack pointers or instruction pointers) to values outside their prewired ranges, or to pop empty stacks, or to divide by 0, or to call nonexistent functions, or to change probabilities of nonexistent tokens, etc. Try (Section 2.1) will interrupt prefixes as soon as their $t>TP$.

Instructions. We defined 68 instructions, such as oldq(n) for calling the $n$-th previously found program $q^n$, or getq(n) for making a copy of $q^n$ on stack ds (e.g., to edit it with additional instructions). Lack of space prohibits to explain all instructions (see [9]) -- we have to limit ourselves to the few appearing in solutions found in the experiments, using readable names instead of their numbers: Instruction c1() returns constant 1. Similarly for c2(), ..., c5(). dec(x) returns $x-1$; by2(x) returns $2x$; grt(x,y) returns 1 if $x>y$, otherwise 0; delD() decrements stack pointer Dp of Ds; fromD() returns the top of Ds; toD() pushes the top entry of $ds$ onto Ds; cpn(n) copies the n topmost ds entries onto the top of ds, increasing dp by $n$; cpnb(n) copies $n$ ds entries above the $cs[cp].dp$-th ds entry onto the top of ds; exec(n) interprets $n$ as the number of an instruction and executes it; bsf(n) considers the entries on stack ds above its $cs[cp].dp+n$-th entry as code and uses callstack cs to call this code (code is executed by step 1 of Try (Section 2.1), one instruction at a time; the instruction ret() causes a return to the address of the next instruction right after the calling instruction). Given $n$ input arguments on ds, instruction defnp() pushes onto ds the begin of a definition of a procedure with $n$ inputs; this procedure returns if its topmost input is 0, otherwise decrements it. callp() pushes onto ds code for a call of the most recently defined function / procedure. Both defnp and callp also push code for making a fresh copy of the inputs of the most recently defined code, expected on top of ds. endnp() pushes code for returning from the current call, then calls the code generated so far on stack ds above the $n$ inputs, applying the code to a copy of the inputs on top of $ds$. boostq(i) sequentially goes through all tokens of the $i$-th self-discovered frozen program, boosting each token's probability by adding $n_Q$ to its enumerator and also to the denominator shared by all instruction probabilities -- denominator and all numerators are stored on tape, defining distribution $p(r)$.

Initialization. Given any task, we add task-specific instructions. We start with a maximum entropy distribution on the $>68$ $Q_i$ (all numerators set to 1), then insert substantial prior bias by assigning the lowest (easily computable) instruction numbers to the task-specific instructions, and by boosting (see above) the initial probabilities of appropriate ``small number pushers'' (such as c1, c2, c3) that push onto ds the numbers of the task-specific instructions, such that they become executable as part of code on ds. We also boost the probabilities of the simple arithmetic instructions by2 and dec, such that the system can easily create other integers from the probable ones (e.g., code sequence (c3 by2 by2 dec) will return integer 11). Finally we also boost boostq.


next up previous
Next: Experiments: Towers of Hanoi Up: Bias-Optimal Incremental Problem Solving Previous: OOPS For Finding Universal
Juergen Schmidhuber 2003-02-25


Back to OOPS home page