Conclusion

In 1931, Kurt Gödel laid the foundations of theoretical
computer science, using
elementary arithmetics to build a universal programming language
for encoding arbitrary proofs, given an arbitrary enumerable set of
axioms. He went on to construct *self-referential* formal statements that claim their own unprovability,
using Cantor's diagonalization trick [5]
to demonstrate that formal systems such as
traditional mathematics are either flawed in a certain
sense or contain unprovable but true statements [11].
Since Gödel's exhibition of the fundamental limits of proof and
computation, and Konrad Zuse's subsequent construction of the first working
programmable computer (1935-1941),
there has been a lot of work on specialized algorithms
solving problems taken from more or less general problem classes.
Apparently, however,
one remarkable fact has so far escaped the attention of computer
scientists: it is possible to use self-referential proof systems to
build optimally efficient yet conceptually very simple
universal problem solvers.

The initial software of our Gödel machine
runs an initial, typically sub-optimal problem solver, e.g., one of
Hutter's approaches
[16,17]
which have at least an optimal *order* of complexity,
or some less general method [20].
Simultaneously, it runs an -optimal initial proof
searcher using an online variant of *Universal Search*
to test *proof techniques*, which are
programs able to compute proofs
concerning the system's own future performance,
based on an axiomatic system
encoded in ,
describing a formal *utility* function ,
the hardware and itself.
If there is no provably good, globally optimal
way of rewriting at all, then humans
will not find one either. But if there is one,
then itself can find
and exploit it. This approach yields the first
class of theoretically sound, fully self-referential,
optimally efficient, general problem solvers.

After the theoretical discussion in Sections 1-5, one practical question remains: to build a particular, especially practical Gödel machine with small initial constant overhead, which generally useful theorems should one add as axioms to (as initial bias) such that the initial searcher does not have to prove them from scratch?