
Novel Theorems for EOMs and GTMs

Theorem 5.3   For $x \in B^\sharp$ with $P^E(x) > 0$,
\begin{displaymath}
KP^E(x) - 1 \leq K^E(x) \leq KP^E(x) + K^E(KP^E(x)) + O(1).
\end{displaymath} (36)

Using $K^E(y) \leq \log y + 2\log\log y + O(1)$ for $y$ interpreted as an integer -- compare Def. 2.8 -- this yields
\begin{displaymath}
2^{-K^E(x)} < P^E(x) \leq O(2^{-K^E(x)})(K^E(x))^2.
\end{displaymath} (37)
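To spell out the step from (36) to the right half of (37): the stated bound on $K^E(y)$ gives $2^{-K^E(y)} \geq \Omega(1/(y (\log y)^2))$; with $y = KP^E(x)$ (which is $-\lg P^E(x)$ up to rounding, and at most $K^E(x) + 1$ by the left half of (36)),

```latex
\begin{displaymath}
2^{-K^E(x)} \;\geq\; \Omega\!\left(2^{-KP^E(x)}\,2^{-K^E(KP^E(x))}\right)
\;\geq\; \Omega\!\left(\frac{P^E(x)}{KP^E(x)\,(\log KP^E(x))^2}\right)
\;\geq\; \Omega\!\left(\frac{P^E(x)}{(K^E(x))^2}\right),
\end{displaymath}
```

since $y (\log y)^2 \leq y^2$ for large $y$; rearranging yields $P^E(x) \leq O(2^{-K^E(x)})(K^E(x))^2$.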

That is, objects that are hard to describe (in the sense that they have only long enumerating descriptions) have low probability.

Proof. The left-hand inequality follows by definition. To show the right-hand side, one can build an EOM $T$ that computes $x \in B^\sharp$ using no more than $KP^E(x) + K_T(KP^E(x)) + O(1)$ input bits, in a way inspired by Huffman coding [22]. The claim then follows from the Invariance Theorem. The trick is to arrange $T$'s computation such that $T$'s output converges yet never needs to decrease lexicographically. $T$ works as follows:

(A) Emulate $U^E$ to construct a real c.e. number $0.s$ encoded as a self-delimiting input program $r$. Simultaneously, run all (possibly forever running) programs on $U^E$ dovetail style; whenever the output of a prefix $q$ of any running program starts with some $x \in B^*$ for the first time, set variable $V(x) := V(x) + 2^{-l(q)}$ (if no program has ever created output starting with $x$, first create $V(x)$ initialized by 0); whenever the output of some extension $q'$ of $q$ (obtained by possibly reading additional input bits; $q' = q$ if none are read) lexicographically increases such that it does not equal $x$ any more, set $V(x) := V(x) - 2^{-l(q')}$.

(B) Simultaneously, starting at the right end of the unit interval $[0,1)$, as the $V(x)$ are being updated, keep updating a chain of disjoint, adjacent, half-open (at the right end) intervals $IV(x) = [LV(x), RV(x))$ of size $V(x) = RV(x) - LV(x)$ in alphabetic order on $x$, such that the right end of the $IV(x)$ of the largest $x$ coincides with the right end of $[0,1)$, and $IV(y)$ is to the right of $IV(x)$ if $y \succ x$. After every variable update and each change of $s$, replace the output of $T$ by the $x$ of the $IV(x)$ with $0.s \in IV(x)$.
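The bookkeeping in steps (A) and (B) can be made concrete. Below is a toy sketch (Python, with illustrative names not taken from the paper) that rebuilds the interval chain for a fixed, finite snapshot of the weights $V(x)$, standing in for the dovetailed enumeration; exact rationals avoid rounding issues:

```python
from fractions import Fraction

def interval_chain(V):
    """Build the chain of disjoint, adjacent, half-open intervals
    IV(x) = [LV(x), RV(x)) of size V(x), in lexicographic order on x,
    with the right end of the largest x's interval pinned to 1."""
    right = Fraction(1)
    chain = {}
    # Python string order puts prefixes first: '0' < '00' < '1';
    # process the largest x first, packing the chain toward 1.
    for x in sorted(V, reverse=True):
        chain[x] = (right - V[x], right)
        right -= V[x]
    return chain

def output_for(s, chain):
    """T's current output: the x whose interval contains the real 0.s."""
    for x, (lo, hi) in chain.items():
        if lo <= s < hi:
            return x
    return None  # 0.s lies left of the whole chain
```

When some $V(x)$ later grows, every boundary at or to the left of $IV(x)$ moves left while those to its right stay put; since boundaries never increase, the string containing a fixed $0.s$ can only increase lexicographically, matching the EOM constraint.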

This will never violate the EOM constraints: the c.e. $s$ cannot shrink, and since EOM outputs cannot decrease lexicographically, the interval boundaries $RV(x)$ and $LV(x)$ cannot grow (their negations are c.e., compare Lemma 4.1), hence $T$'s output cannot decrease.

For $x \in B^*$ the $IV(x)$ converge towards an interval $I(x)$ of size $P^E(x)$. For $x \in B^\infty$ with $P^E(x) > 0$, we have: for any $\epsilon > 0$ there is a time $t_0$ such that for all time steps $t > t_0$ in $T$'s computation, an interval $I_{\epsilon}(x)$ of size $P^E(x) - \epsilon$ will be completely covered by certain $IV(y)$ satisfying $x \succ y$ and $0.x - 0.y < \epsilon$. So for $\epsilon \rightarrow 0$ the $I_{\epsilon}(x)$ also converge towards an interval $I(x)$ of size $P^E(x)$. Hence $T$ will output larger and larger $y$ approximating $x$ from below, provided $0.s \in I(x)$.

Since any interval of size $c$ within $[0,1)$ contains a number $0.z$ with $l(z) = -\lg c$, in both cases there is a number $0.s$ (encodable by some $r$ satisfying $l(r) \leq l(s) + K_T(l(s)) + O(1)$) with $l(s) = -\lg P^E(x) + O(1)$, such that $T(r) \leadsto x$, and therefore $K_T(x) \leq l(s) + K_T(l(s)) + O(1)$.
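The first claim of this paragraph -- an interval of size $c$ contains a dyadic rational $0.z$ whose length $l(z)$ is $-\lg c$ up to rounding -- can be checked constructively; a sketch with an illustrative function name:

```python
import math
from fractions import Fraction

def dyadic_in_interval(lo, hi):
    """Return a binary string z with 0.z in [lo, hi) (Fractions in [0, 1)).

    With n = ceil(-lg(hi - lo)), multiples of 2^-n are spaced at most
    hi - lo apart, so the smallest one >= lo already lies inside [lo, hi)."""
    n = max(1, math.ceil(-math.log2(hi - lo)))     # l(z) = -lg c, rounded up
    k = -(-(lo.numerator << n) // lo.denominator)  # ceil(lo * 2^n)
    assert lo <= Fraction(k, 2**n) < hi            # spacing <= c guarantees this
    return format(k, f'0{n}b')                     # first n bits after the point
```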

Less symmetric statements can also be derived in a very similar fashion:

Theorem 5.4   Let TM $T$ induce approximable $CP_T(x)$ for all $x \in B^*$ (compare Defs. 4.10 and 4.12; an EOM would be a special case). Then for $x \in B^\sharp$ with $P_T(x) > 0$:
\begin{displaymath}
K^G(x) \leq KP_T(x) + K^G(KP_T(x)) + O(1).
\end{displaymath} (38)

Proof. Modify the proof of Theorem 5.3 for approximable as opposed to c.e. interval boundaries and approximable $0.s$.

A similar proof, but without the complication for the case $x \in B^\infty$, yields:

Theorem 5.5   Let $\mu$ denote an approximable semimeasure on $x \in B^*$; that is, $\mu(x)$ is describable. Then for $\mu(x) > 0$:
\begin{displaymath}
Km^G(x) \leq K\mu(x) + Km^G(K\mu(x)) + O(1);
\end{displaymath} (39)

\begin{displaymath}
K^G(x) \leq K\bar{\mu}(x) + K^G(K\bar{\mu}(x)) + O(1).
\end{displaymath} (40)

As a consequence,
\begin{displaymath}
\mu(x) \leq O(2^{-Km^G(x)})(Km^G(x))^2; \qquad \bar{\mu}(x) \leq O(2^{-K^G(x)})(K^G(x))^2.
\end{displaymath} (41)

Proof. Initialize variables $V_{\lambda} := 1$ and $IV_{\lambda} := [0,1)$. Dovetailing over all $x \succ \lambda$, approximate the GTM-computable $\bar{\mu}(x) = \mu(x) - \mu(x0) - \mu(x1)$ in variables $V_x$ initialized by zero, and create a chain of adjacent intervals $IV_x$ analogously to the proof of Theorem 5.3.

The $IV_x$ converge to intervals $I_x$ of size $\bar{\mu}(x)$. Hence $x$ is GTM-encodable by any program $r$ producing an output $s$ with $0.s \in I_x$: after every update, replace the GTM's output by the $x$ of the $IV_x$ with $0.s \in IV_x$. Similarly, if $0.s$ lies in the union of adjacent intervals $I_y$ of strings $y$ starting with $x$, then the GTM's output will converge towards some string starting with $x$. The rest follows in a way similar to the final paragraph of the proof of Theorem 5.3.
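For a static, exactly computable semimeasure the construction collapses to ordinary interval coding; a minimal sketch (illustrative names not from the paper, intervals packed left-to-right from 0 rather than pinned at the right end, and the dovetailed approximation replaced by direct evaluation):

```python
from fractions import Fraction

def lex_strings(depth, x=''):
    """Binary strings of length <= depth in lexicographic order,
    prefixes before extensions: '', '0', '00', '01', '1', '10', '11', ..."""
    yield x
    if len(x) < depth:
        yield from lex_strings(depth, x + '0')
        yield from lex_strings(depth, x + '1')

def build_chain(mu, depth):
    """Adjacent half-open intervals I_x of size mu_bar(x) = mu(x) - mu(x0) - mu(x1).

    mu must be a semimeasure (mu(x0) + mu(x1) <= mu(x)), so sizes are
    nonnegative; at the truncation depth the remaining mass mu(x) stays
    with x, so the sizes telescope to a total of mu('')."""
    chain, left = {}, Fraction(0)
    for x in lex_strings(depth):
        size = mu(x) if len(x) == depth else mu(x) - mu(x + '0') - mu(x + '1')
        chain[x] = (left, left + size)
        left += size
    return chain

def decode(s, chain):
    """The x whose interval I_x contains 0.s (s a Fraction in [0, mu(''))."""
    for x, (lo, hi) in chain.items():
        if lo <= s < hi:
            return x
```

Because the order puts prefixes first, the intervals of all strings starting with $x$ form one contiguous block of total size $\mu(x)$; any $0.s$ in that block decodes to an extension of $x$, which is the convergence property used in the proof.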

Using the basic ideas in the proofs of Theorems 5.3 and 5.5 in conjunction with Corollary 4.3 and Lemma 4.2, one can also obtain statements such as:

Theorem 5.6   Let $\mu_0$ denote the universal CEM from Theorem 4.1. For $x \in B^*$,
\begin{displaymath}
K\mu_0(x) - O(1) \leq Km^E(x) \leq K\mu_0(x) + Km^E(K\mu_0(x)) + O(1).
\end{displaymath} (42)

While $P^E$ dominates $P^M$ and $P^G$ dominates $P^E$, the reverse statements are not true. In fact, given the results from Sections 3.2 and 5, one can now make claims such as the following:

Corollary 5.1   The following functions are unbounded:
\begin{displaymath}
\frac{\mu^E(x)}{P^M(x)}; \qquad \frac{P^E(x)}{P^M(x)}; \qquad \frac{P^G(x)}{P^E(x)}.
\end{displaymath}
Proof. For the cases $\mu^E$ and $P^E$, apply Theorems 5.2, 5.6 and the unboundedness of (10). For the case $P^G$, apply Theorems 3.3 and 5.3.

Juergen Schmidhuber 2003-02-13