All traditional algorithms for problem solving / machine learning / reinforcement learning  are hardwired. Some are designed to improve some limited type of policy through experience, but are not part of the modifiable policy, and cannot improve themselves in a theoretically sound way. Humans are needed to create new / better problem solving algorithms and to prove their usefulness under appropriate assumptions.
Here we will eliminate the restrictive need for human effort in the most general way possible, leaving all the work including the proof search to a system that can rewrite and improve itself in arbitrary computable ways and in a most efficient fashion. To attack this ``Grand Problem of Artificial Intelligence,'' we introduce a novel class of optimal, fully self-referential  general problem solvers called Gödel machines [45,46].1They are universal problem solving systems that interact with some (partially observable) environment and can in principle modify themselves without essential limits apart from the limits of computability. Their initial algorithm is not hardwired; it can completely rewrite itself, but only if a proof searcher embedded within the initial algorithm can first prove that the rewrite is useful, given a formalized utility function reflecting computation time and expected future success (e.g., rewards). We will see that self-rewrites due to this approach are actually globally optimal (Theorem 4.1, Section 4), relative to Gödel's well-known fundamental restrictions of provability . These restrictions should not worry us; if there is no proof of some self-rewrite's utility, then humans cannot do much either.
The initial proof searcher is -optimal (has an optimal order of complexity) in the sense of Theorem 5.1, Section 5. Unlike hardwired systems such as Hutter's [16,17] and Levin's [24,26] (Section 6.4), however, a Gödel machine can in principle speed up any part of its initial software, including its proof searcher, to meet arbitrary formalizable notions of optimality beyond those expressible in the -notation. Our approach yields the first theoretically sound, fully self-referential, optimal, general problem solvers.
Outline. Section 2 presents basic concepts and fundamental limitations, Section 3 the essential details of a self-referential axiomatic system, Section 4 the Global Optimality Theorem 4.1, and Section 5 the -optimal (Theorem 5.1) initial proof searcher. Section 6 provides examples and relations to previous work, briefly discusses issues such as a technical justification of consciousness, and lists answers to several frequently asked questions about Gödel machines.