Proof. Target theorem (2)
implicitly talks about all the other
switchprogs that the proof searcher
could produce in the future. To see this, consider
the two alternatives of the binary decision:
(1) either execute the current switchprog (set switchbit
), or
(2) keep searching for proofs and switchprogs (set switchbit
)
until the systematic
searcher comes up with an even better switchprog.
Obviously the second alternative concerns all (possibly infinitely
many) potential switchprogs to be considered later. That is,
if the current switchprog were not the `best', then
the proof searcher would not be able to prove that
setting switchbit and
executing switchprog will cause higher expected reward
than discarding switchprog, assuming consistency of
.
Q.E.D.
The initial proof searcher of Section 5
already generates all possible proofs and switchprogs in
-optimal fashion.
Nevertheless, since it is part of
, its proofs can speak about the
proof searcher itself, possibly triggering proof searcher rewrites
resulting in better than merely
-optimal performance.