Some of the novel results herein may be of interest to theoretical computer scientists and mathematicians (Sections 2-6), some to researchers in the fields of machine learning and inductive inference (the science of making predictions based on observations, e.g., 6-7), some to physicists (e.g., 6-8), some to philosophers (e.g., 7-8). Sections 7-8 might help those usually uninterested in technical details to decide whether they would also like to delve into the more formal Sections 2-6. In what follows, we summarize the main contributions and provide pointers to the most important theorems.
Section 2 introduces universal Turing Machines (TMs) more general than those considered in previous related work: unlike traditional TMs, General TMs or GTMs may edit their previous outputs (compare inductive TMs [#!Burgin:83!#]), and Enumerable Output Machines (EOMs) may do this provided the output does not decrease lexicographically. We will define: a formally describable object x has a finite, never halting GTM program that computes x such that each output bit is revised at most finitely many times; that is, each finite prefix of x eventually stabilizes (Defs. 2.1-2.5); describable functions can be implemented by such programs (Def. 2.10); weakly decidable problems have solutions computable by never halting programs whose output is wrong for at most finitely many steps (Def. 2.11). Theorem 2.1 generalizes the halting problem by demonstrating that it is not weakly decidable whether a finite string is a description of a describable object (compare a related result for analytic TMs by Hotz, Vierke and Schieffer [#!Hotz:95!#]).
Section 3 generalizes the traditional concept of Kolmogorov complexity or algorithmic information [#!Kolmogorov:65!#,#!Solomonoff:64!#,#!Chaitin:69!#] of finite x (the length of the shortest halting program computing x) to the case of objects describable by nonhalting programs on EOMs and GTMs (Defs. 3.2-3.4). It is shown that the generalization for EOMs is describable, but the one for GTMs is not (Theorem 3.1). Certain objects are much more compactly encodable on EOMs than on traditional monotone TMs, and Theorem 3.3 shows that there are also objects with short GTM descriptions yet incompressible on EOMs and therefore ``more random'' than Chaitin's [#!Chaitin:87!#], the halting probability of a TM with random input, which is incompressible only on monotone TMs. This yields a natural TM type-specific complexity hierarchy expressed by Inequality (14).
Section 4 discusses probability distributions on describable objects as well as the nondescribable convergence probability of a GTM (Def. 4.14). It also introduces describable (semi)measures as well as cumulatively enumerable measures (CEMs, Def. 4.5), where the cumulative probability of all strings lexicographically greater than a given string x is EOM-computable or enumerable. Theorem 4.1 shows that there is a universal CEM that dominates all other CEMs, in the sense that it assigns higher probability to any finite y, save for a constant factor independent of y. This probability is shown to be equivalent to the probability that an EOM whose input bits are chosen randomly produces an output starting with y (Corollary 4.3 and Lemma 4.2). The nonenumerable universal CEM also dominates enumerable priors studied in previous work by Solomonoff, Levin and others [#!Solomonoff:64!#,#!Zvonkin:70!#,#!Levin:74!#,#!Gacs:74!#,#!Chaitin:75!#,#!Gacs:83!#,#!Schnorr:73!#,#!Solomonoff:78!#,#!Chaitin:87!#,#!LiVitanyi:97!#]. Theorem 4.2 shows that there is no universal approximable measure (proof by M. Hutter).
Section 5 establishes relationships between generalized Kolmogorov complexity and generalized algorithmic probability, extending previous work on enumerable semimeasures by Levin, Gács, and others [#!Zvonkin:70!#,#!Levin:74!#,#!Gacs:74!#,#!Chaitin:75!#,#!Gacs:83!#,#!LiVitanyi:97!#]. For instance, Theorem 5.3 shows that the universal CEM assigns a probability to each enumerable object proportional to raised to the power of the length of its minimal EOM-based description, times a small corrective factor. Similarly, objects with approximable probabilities yet without very short descriptions on GTMs are necessarily very unlikely a priori (Theorems 5.4 and 5.5). Additional suspected links between generalized Kolmogorov complexity and probability are expressed in form of Conjectures 5.1-5.3.
Section 6 addresses issues of temporal complexity ignored in the previous sections on describable universe histories (whose computation may require excessive time without recursive bounds). In Subsection 6.2, Levin's universal search algorithm [#!Levin:73!#,#!Levin:84!#] (which takes into account program runtime in an optimal fashion) is modified to obtain the fastest way of computing all ``S-describable'' universes computable within countable time (Def. 6.1, Section 6.3); uncountably many other universes are ignored because they do not even exist from a constructive point of view. Postulate 6.1 then introduces a natural resource-oriented bias reflecting constraints of whoever calculated our universe (possibly as a by-product of a search for something else): we assign to universes prior probabilities inversely proportional to the time and space resources consumed by the most efficient way of computing them. Given the resulting ``speed prior S'' (Def. 6.5) and past observations x, Theorem 6.1 and Corollary 6.1 demonstrate that the best way of predicting a future y is to minimize the Levin complexity of (x,y).
Section 7 puts into perspective the algorithmic priors (recursive and enumerable) introduced in previous work on inductive inference by Solomonoff and others [#!Solomonoff:64!#,#!Solomonoff:78!#,#!LiVitanyi:97!#,#!Hutter:99!#], as well as the novel priors discussed in the present paper (cumulatively enumerable, approximable, resource-optimal). Collectively they yield an entire spectrum of algorithmic TOEs. We evaluate the plausibility of each prior being the one from which our own universe is sampled, discuss its connection to ``Occam's razor'' as well as certain physical and philosophical consequences, argue that the resource-optimal speed prior S may be the most plausible one (Section 7.4), analyze the inference problem from the point of view of an observer [#!Boskovich:1755!#,#!Boskovich:1966!#,#!Toffoli:79!#,#!Zurek:91!#,#!Svozil:94!#,#!Roessler:98!#] evolving in a universe sampled from S, make appropriate predictions for our own universe (Section 7.5), and discuss their falsifiability.