

Computable Predictions through the Speed Prior Based on the Fastest Way of Describing Objects

Unfortunately, while $M$ and the more general priors of Section 4 are computable in the limit, they are not recursive, and thus practically infeasible. This drawback inspired less general yet practically more feasible principles of minimum description length (MDL) [68,41] as well as priors derived from time-bounded restrictions [31] of Kolmogorov complexity [28,59,8]. No particular instance of these approaches, however, is universally accepted or has a general convincing motivation that carries beyond rather specialized application scenarios. For instance, typical efficient MDL approaches require the specification of a class of computable models of the data, say, certain types of neural networks, plus some computable loss function expressing the coding costs of the data relative to the model. This provokes numerous ad-hoc choices.
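For instance, the two-part flavor of MDL compares candidate models by the bits needed to state the model plus the bits needed to encode the data given the model. A minimal sketch of this idea (the Bernoulli model class, the 7-bit parameter precision, and the data are all illustrative assumptions, not from the text):

```python
import math

def two_part_code_length(bits, model_bits, p):
    """Two-part MDL cost: bits to state the model plus bits to encode
    the data under that model (ideal Shannon code lengths)."""
    data_cost = sum(-math.log2(p if b == 1 else 1.0 - p) for b in bits)
    return model_bits + data_cost

data = [1] * 90 + [0] * 10
# Model A: fair coin -- nothing to transmit about the model.
cost_fair = two_part_code_length(data, model_bits=0, p=0.5)
# Model B: biased coin, with its parameter sent at ~7 bits of precision.
cost_biased = two_part_code_length(data, model_bits=7, p=0.9)
best = "biased" if cost_biased < cost_fair else "fair"
```

Here the biased model wins despite its extra 7 model bits, because it compresses the skewed data far better; the ad-hoc choices mentioned above appear as the model class and the chosen parameter precision.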

Our recent work [55], however, offers an alternative to the celebrated but noncomputable algorithmic simplicity measure or Solomonoff-Levin measure discussed above [59,75,60]. We introduced a new measure (a prior on the computable objects) which is not based on the shortest but on the fastest way of describing objects.

Let us assume that the observed data sequence is generated by a computational process, and that any possible sequence of observations is therefore computable in the limit [50]. This assumption is stronger and more radical than the traditional one: Solomonoff just insists that the probability of any sequence prefix is recursively computable, but the (infinite) sequence itself may still be generated probabilistically.

Given our starting assumption that data are deterministically generated by a machine, it seems plausible that the machine suffers from a computational resource problem. Since some things are much harder to compute than others, the resource-oriented point of view suggests the following postulate.

Postulate 1   The cumulative prior probability measure of all $x$ incomputable within time $t$ by any method is at most inversely proportional to $t$.
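In symbols, the postulate reads (a restatement under the natural reading; $P$ denotes the prior measure and $c$ some positive constant):

```latex
\sum_{\substack{x \text{ incomputable within time } t \\ \text{by any method}}} P(x) \;\le\; \frac{c}{t} .
```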

This postulate leads to the Speed Prior $S(x)$, the probability that the output of the following probabilistic algorithm, called GUESS, starts with $x$ [55]:
1. Toss an unbiased coin until it comes up heads; let $i$ denote the number of trials required; set $t:=2^i$.

2. If the number of steps executed so far exceeds $t$ then exit. Execute one step; if this leads to a request for a new input bit (of the growing self-delimiting program, e.g., [30,31]), toss the coin to determine the bit, and set $t:=t/2$.

3. Go to 2.
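The three steps above can be sketched as follows (a minimal sketch: `max_output_bits` and the trivial "machine", which merely copies each requested program bit to the output, are illustrative stand-ins for the universal reference machine):

```python
import random

def guess(max_output_bits=20, rng=random):
    """One run of the GUESS algorithm (sketch).

    Step 1: sample a time budget t = 2^i, where i is the number of
    fair coin tosses needed to see heads (so i occurs with prob. 2^-i).
    Steps 2-3: run the machine under that budget; every request for a
    fresh program bit is answered by a coin toss and halves t.
    """
    # Step 1: toss an unbiased coin until heads; t := 2^i.
    i = 1
    while rng.random() < 0.5:
        i += 1
    t = 2 ** i

    output = []
    steps = 0
    while len(output) < max_output_bits:
        # Step 2: exit once the step count exceeds the budget t.
        if steps > t:
            break
        steps += 1
        # Toy stand-in for the reference machine: each step requests one
        # new bit of the growing self-delimiting program and copies it
        # to the output (a real machine would interpret the program).
        bit = 1 if rng.random() < 0.5 else 0
        t /= 2  # each requested input bit halves the remaining budget
        output.append(bit)
    return output
```

Long outputs are exponentially unlikely, since each fresh program bit halves the remaining budget: this is the resource-oriented bias expressed by Postulate 1.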

Algorithm GUESS is very similar to a probabilistic search algorithm used in previous work on applied inductive inference [47,49]. On several toy problems it generalized extremely well, in a way unmatched by traditional neural network learning algorithms.

With $S$ comes a computable method AS for predicting optimally within $\epsilon$ accuracy [55]. Consider a finite but unknown program $p$ computing $y \in B^{\infty}$. What if Postulate 1 holds but $p$ is not optimally efficient, and/or runs on a computer that differs from our reference machine? Then we effectively sample beginnings $y_k$ not from $S$ but from an alternative semimeasure $S'$. Can we still predict well? Yes, because the Speed Prior $S$ dominates $S'$, and this dominance is all we need to apply the recent loss bounds [21]: the loss we expect to incur by predicting according to AS instead of using the true but unknown $S'$ does not exceed the optimal loss by much [55].
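Dominance is meant here in the standard multiplicative sense (a restatement; the constant $c$ depends on $S'$ but not on the data):

```latex
\exists\, c > 0 \ \ \forall x : \quad S(x) \;\ge\; c \, S'(x) .
```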


Juergen Schmidhuber 2003-02-04