Basic Cycle of Operations
Until unknown time
(system death), the system repeats the
following basic instruction cycle over and over.
- 1.
- Select instruction head
with probability
,
where
Here the collective decision function
maps
real-valued
to real values. Given an appropriate
, each
module may ``veto'' instructions suggested by the other module.
Only instructions that are strongly supported by both modules are
highly likely to be selected. One possibility is
. In the experiments I use
.
Comment: owing to peculiarities of certain instructions to be
introduced below,
will later be refined for cases where
addresses an instruction head as opposed to an argument.
- 2.
's
arguments
are
selected according to probability distributions
(except when
Bet! -- two of Bet!'s arguments will be treated
differently -- see Section A.3.4 below).
- 3.
- Execute the selected instruction. This will consume time and may
change (1) environment
, (2) IP, (3) internal state
; (4a)
, (4b)
. If there is
external reward
then set
(rewards
become visible to the system in the form of inputs).
- 4.
- If an input has changed one of the cell contents

,



, then shift the contents of

, 
,
,
to components

, 
,
,
,
respectively. This results in a built-in short-term memory
(long-term memory can be implemented by the system itself by
executing appropriate instruction sequences).
- 5.
- If
did not modify IP (no conditional jump -- compare
instruction list below), then compute the address of the next
instruction head by setting IP
.
Here
, where
is selected according to probability distribution
, while
is selected according
to
.
- 6.
- Goto 1.
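The veto effect described in step 1 can be illustrated with a minimal Python sketch. It rests on assumptions not taken from the text above: each module is assumed to supply one nonnegative value per candidate instruction, the selection probability is assumed to be the collective decision function applied to the two modules' values and normalized over all candidates, and the product is used as one illustrative choice of that function. The names `product`, `selection_probabilities`, `select_instruction`, `left` and `right` are hypothetical.

```python
import random


def product(x, y):
    """Illustrative collective decision function (an assumption, not
    necessarily the one used in the experiments)."""
    return x * y


def selection_probabilities(left, right, f=product):
    """Normalize f(left[j], right[j]) over all candidate instructions j."""
    scores = [f(x, y) for x, y in zip(left, right)]
    total = sum(scores)
    if total <= 0:
        raise ValueError("no instruction has positive joint support")
    return [s / total for s in scores]


def select_instruction(left, right, f=product, rng=random):
    """Draw one instruction index according to the joint probabilities."""
    probs = selection_probabilities(left, right, f)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Hypothetical per-module values for three candidate instructions.
left = [0.7, 0.2, 0.1]
right = [0.0, 0.5, 0.5]

# Instruction 0 is strongly favored by the first module but vetoed by the
# second (value 0.0), so its joint probability is zero.
probs = selection_probabilities(left, right)
```

With a product-like function, a value of zero from either module removes an instruction from consideration regardless of how strongly the other module supports it, which is the sense in which each module can veto the other.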
Juergen Schmidhuber
2003-03-10