Next: Global Optimality Theorem
Up: Essential Details of One
Previous: Proof Techniques
Important Instructions Used by Proof Techniques
The nature of the six proof-modifying instructions below
(there are no others)
makes it impossible to insert an
incorrect theorem into proof,
thus trivializing proof verification.
Let us first provide a brief overview of the most
important instructions: get-axiom(n)
appends the -th possible axiom
to the current proof,
apply-rule(k, m, n) applies the -th
inference rule to the -th and -th theorem
in the current proof (appending the result),
tests whether the last theorem in proof
is a target theorem showing that a self-rewrite
through switchprog would be useful.
The details are as follows.
takes as argument an integer computed
by a prefix of the currently tested proof technique
with the help of arithmetic instructions
such as those used in previous work
Then it appends
the -th axiom (if it exists, according to the axiom scheme below)
as a theorem to the current theorem
sequence in proof. The initial axiom scheme encodes:
- Hardware axioms
describing the hardware,
formally specifying how
certain components of (other than
environmental inputs ) may
change from one cycle to the next.
For example, if the hardware is a
Turing machine2(TM) ,
then is a bitstring that encodes the current
contents of all tapes of the TM, the positions of its
scanning heads, and the current internal state
of the TM's finite state automaton, while specifies
the TM's look-up table which maps any possible combination
of internal state and bits above scanning heads
to a new internal state and an action such as:
replace some head's current bit by 1/0, increment
(right shift) or decrement (left shift) some
scanning head, read and copy next input bit to cell above
input tape's scanning head,
Alternatively, if the hardware is given by the abstract model
of a modern microprocessor with limited storage, will
encode the current storage contents,
instruction pointers etc.
For instance, the following axiom could
describe how some 64-bit hardware's instruction pointer
stored in is continually incremented as long as
there is no overflow and the value
of does not indicate that a jump to some other address
should take place:
Here the semantics of used symbols such
as `(' and `' and `' (implies)
are the traditional ones, while `'
symbolizes a function
translating bitstrings into numbers.
It is clear that any abstract hardware model can be
fully axiomatized in a similar way.
- Reward axioms
defining the computational costs of any
and physical costs of output
actions, such as control signals
encoded in .
Related axioms assign values to certain input events
(encoded in variable , a substring of )
representing reward or punishment (e.g.,
when a Gödel machine-controlled robot bumps into an obstacle).
Additional axioms define the total value of the Gödel machine's life as a
scalar-valued function of all
rewards (e.g., their sum) and
costs experienced between cycles and , etc.
For example, assume that
can be changed only through
external inputs; the following example axiom
says that the total reward increases by 3 whenever
such an input equals `11'
(unexplained symbols carry the obvious meaning):
where is interpreted as
the cumulative reward between times and .
It is clear that any formal scheme for producing
rewards can be fully axiomatized in a similar way.
- Environment axioms
restricting the way the environment will produce
new inputs (encoded within certain substrings of ) in reaction to
sequences of outputs encoded in .
For example, it may be known
in advance that the environment is sampled from an unknown probability
distribution contained in a given set
of possible distributions (compare equation 1).
E.g., may contain all distributions
that are computable, given the previous history
or at least limit-computable [41,42].
Or, more restrictively, the environment
may be some unknown but deterministic computer
sampled from the Speed Prior  which assigns
low probability to environments that are hard to compute by any method.
Or the interface to the environment is Markovian ,
that is, the current
input always uniquely identifies the environmental state--a lot
of work has already been done
on this special case [33,2,55].
Even more restrictively, the environment may evolve in completely
predictable fashion known in advance.
All such prior assumptions
are perfectly formalizable in an appropriate
(otherwise we could not write scientific papers about them).
- Uncertainty axioms; string manipulation axioms:
Standard axioms for arithmetics and calculus
and probability theory 
and string manipulation that (in conjunction
with the hardware axioms and environment axioms) allow for constructing proofs
concerning (possibly uncertain) properties of future values of
as well as bounds on expected remaining lifetime /
costs / rewards,
given some time and certain
hypothetical values for components of etc.
An example theorem saying something about
expected properties of future inputs might look like this:
represents a conditional probability
with respect to an axiomatized prior distribution from
a set of distributions described by
the environment axioms (Item 1c).
Given a particular
and formalizable assumptions about the
possibly probabilistic environment
obviously one can fully axiomatize
everything that is needed for
proof-based reasoning about past and future machine states.
- Initial state axioms:
Information about how
to reconstruct the initial state or parts thereof,
such that the proof searcher
can build proofs including
axioms of the type
Here and in the remainder of the paper
we use bold font in formulas to indicate
syntactic place holders (such as m,n,z)
for symbol strings representing
variables (such as m,n,z)
whose semantics are explained in
the text--in the present context is the
Note that it is no fundamental problem to fully encode
both the hardware description and the initial
itself. To see this, observe that some software may
include a program that can print the software.
- Utility axioms
describing the overall goal
in the form of utility function ; e.g., equation (1) in
apply-rule(k, m, n)
takes as arguments the index (if it exists) of
an inference rule
such as modus ponens
(stored in a list of possible inference rules
encoded within ) and the indices
of two previously proven theorems (numbered in order of
their creation) in the current proof.
If applicable, the corresponding inference rule is
applied to the addressed theorems
and the resulting theorem appended to proof. Otherwise
the currently tested proof technique is interrupted. This
ensures that proof is never fed with invalid proofs.
deletes the -th theorem in the currently stored
proof, thus freeing storage such that proof-storing parts of
can be reused and the maximal proof size is not necessarily limited by
Theorems deleted from proof, however, cannot be addressed any more by
apply-rule to produce further prolongations of proof.
replaces by , provided
that is indeed a non-empty substring of
, the storage writable by proof techniques.
verifies whether the goal of the proof search has been reached.
First it tests whether the last theorem (if any) in proof
has the form of a target theorem. A target theorem states
that given the current axiomatized utility
function (Item 1f), the utility of a
switch from to the current switchprog
would be higher than the utility of continuing the execution
of (which would keep searching for alternative switchprogs).
Target theorems are
(encoded in )
of the (decoded) form
where the variable (represented by syntactic place holder )
stands for a time step,
while all other symbols belong to the alphabet of
the theorem-proving calculus, that is,
the set of possible target theorems is parameterized only by .
Here the calculus should permit the notation
a shortcut for the state obtained when we replace
the true value of the variable bit
(encoded in )
at time , by
. This will
facilitate the formulation of theorems
that compare values conditioned on various alternative
hypothetical properties of . (Note that may
be only partially known by the current proof technique
even in environments where
and switchbit() are
fully predetermined for all valid .)
The purpose of introducing is to deal with hardware-specific
temporal delays that may be involved in checking and switching--it may
take a significant amount of
time to match abstract symbol strings found during proof search to the
Gödel machine's real current state.
If a target theorem has been found, check() uses a
simple prewired subroutine (also encoded in , of course)
to check whether there is enough time left
to set variable switchbit (originally 0) to 1 before
the continually increasing will equal .
If this subroutine
returns a negative result, check() exits.
Otherwise it sets switchbit
(there is no other way of changing switchbit).
Then it repeatedly tests until , to make sure
the condition of formula (2)
was fulfilled at .
Then it transfers control to switchprog
(there is no other way of calling switchprog).
may subsequently rewrite all parts of , excluding hardware-reserved
parts such as and , but including .
takes two integer arguments
and tries to transform the current contents of
into a theorem of the form
where represents a time measured (by checking time)
shortly after state2theorem was invoked,
and the bistring (recall the special
case of Item 1e).
So we accept the time-labeled current
observable contents of any part of as a theorem that does not have
to be proven in an alternative way from, say, the
initial state , because the computation so far
has already demonstrated that the theorem is true.
Thus we may exploit information conveyed
by environmental inputs, and the fact that sometimes
(but not always) the fastest way
to determine the output of a program is to run it.
This non-traditional online interface between syntax and semantics
requires special care though.
We must avoid inconsistent results through
parts of that change while being read.
For example, the present value of a quickly changing
instruction pointer IP (continually updated by the hardware)
may be essentially unreadable in the
sense that the execution of the reading subroutine
itself will already modify IP many times.
For convenience, the (typically limited) hardware could be set up
such that it stores the contents of
fast hardware variables every cycles in
a reserved part of , such that
an appropriate variant of state2theorem() could at least
translate certain recent values of fast variables into theorems.
This, however, will not abolish all problems associated
For example, the to be read might
also contain the reading procedure's
own, temporary, constantly changing string pointer variables,
etc.3To address such problems on computers with limited
first uses some fixed protocol
(encoded in , of course)
to check whether the current is readable
at all or whether it might change if it
were read by the remaining code of state2theorem.
If so, or if are not in the proper range,
then the instruction has no further effect. Otherwise it
appends an observed theorem of the form
For example, if the current time is 7770000, then the invocation
of state2theorem(6,9) might return the theorem
reflects the time needed by state2theorem
to perform the initial check
and to read leading bits off the continually increasing
(reading also costs time) such that
it can be sure that is a recent proper time label following
the start of state2theorem.
The axiomatic system is a defining
parameter of a given Gödel machine. Clearly, must be
strong enough to permit proofs of target theorems.
In particular, the theory of uncertainty axioms
(Item 1d) must be sufficiently rich.
This is no fundamental problem: we simply
insert all traditional axioms of probability theory .
Next: Global Optimality Theorem
Up: Essential Details of One
Previous: Proof Techniques