The Optimal Ordered Problem Solver (OOPS) [38,40], used by BIOPS in Section 2.3, is a bias-optimal (see Def. 2.1) method of searching for a program that solves each problem in an ordered sequence of problems of a reasonably general type, continually organizing, managing, and reusing previously acquired knowledge. Solomonoff recently also proposed related ideas for a scientist's assistant that modifies the probability distribution of universal search based on experience.
As pointed out earlier (section on OOPS limitations), however, OOPS-like methods are not directly applicable to general lifelong reinforcement learning tasks such as those for which AIXI was designed. It is nevertheless possible to use two OOPS modules as components of a rather general reinforcement learner (OOPS-RL): one module learns a predictive model of the environment, and the other uses this world model to search for an action sequence that maximizes expected reward [38,42]. Despite the bias-optimality of OOPS for certain ordered task sequences, OOPS-RL is not necessarily the best way of spending limited time in general reinforcement learning situations, such as those in which the Gödel machine is optimal in the sense of its utility function.
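The two-module structure of OOPS-RL can be illustrated with a small sketch. It is not OOPS itself: a tabular count-based predictor stands in for the learned world model, and exhaustive enumeration of fixed-length action sequences stands in for bias-optimal program search. All names (`WorldModel`, `expected_return`, `plan`) and the toy environment are assumptions made for illustration only.

```python
from collections import defaultdict
from itertools import product


class WorldModel:
    """Module 1 (stand-in): tabular predictor built from observed transitions."""

    def __init__(self):
        # (state, action) -> {(next_state, reward): observation count}
        self.counts = defaultdict(lambda: defaultdict(int))

    def update(self, state, action, next_state, reward):
        self.counts[(state, action)][(next_state, reward)] += 1

    def predict(self, state, action):
        """Return [(probability, next_state, reward)]; empty if never observed."""
        outcomes = self.counts.get((state, action))
        if not outcomes:
            return []
        total = sum(outcomes.values())
        return [(n / total, s, r) for (s, r), n in outcomes.items()]


def expected_return(model, state, actions):
    """Expected cumulative reward of an action sequence under the model."""
    if not actions:
        return 0.0
    return sum(
        p * (r + expected_return(model, s, actions[1:]))
        for p, s, r in model.predict(state, actions[0])
    )


def plan(model, state, action_set, horizon):
    """Module 2 (stand-in): brute-force search over all sequences of length `horizon`."""
    best = max(
        product(action_set, repeat=horizon),
        key=lambda seq: expected_return(model, state, seq),
    )
    return list(best)


if __name__ == "__main__":
    # Hypothetical three-state chain: moving right from state 1 yields reward 1.
    model = WorldModel()
    for s, a, s2, r in [(0, "right", 1, 0), (1, "right", 2, 1),
                        (0, "left", 0, 0), (1, "left", 0, 0)]:
        model.update(s, a, s2, r)
    print(plan(model, 0, ["left", "right"], horizon=2))  # ['right', 'right']
```

The cost of the exhaustive planner grows exponentially with the horizon, which is one concrete way to see why the allocation of limited search time matters in such a design.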