
ARCHITECTURE 2

Figure 3 shows a recurrent subgoal generator $S$ (a back-prop net that feeds its output back to part of its input).

For problem $p$, the input vector of $S$ at the first 'time step' of the sequential subgoal generation process is $s^p \circ g^p$, and the output of $S$ is the first subgoal $s^p(1)$.

At time step $t$, $1 < t < n+1$, the input of $S$ is $s^p(t-1) \circ g^p$, and its output is $s^p(t)$.
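The recurrence above can be sketched as a short loop. This is only an illustration under stated assumptions: $\circ$ is read as vector concatenation, and `generate_subgoals` and `midpoint_S` are hypothetical names. `midpoint_S` is a hand-written stand-in for the trained back-prop net $S$, not the network itself.

```python
from typing import Callable, List

Vector = List[float]


def generate_subgoals(S: Callable[[Vector], Vector],
                      start: Vector, goal: Vector, n: int) -> List[Vector]:
    """Return [s(1), ..., s(n)] by feeding each output back to the START input."""
    subgoals = []
    current = list(start)                    # plays the role of s^p at the first step
    for _ in range(n):
        current = S(current + list(goal))    # input is s^p(t-1) concatenated with g^p
        subgoals.append(current)
    return subgoals


def midpoint_S(x: Vector) -> Vector:
    """Hypothetical stand-in for S: move halfway from the current state to the goal."""
    d = len(x) // 2
    return [(a + b) / 2 for a, b in zip(x[:d], x[d:])]


subgoals = generate_subgoals(midpoint_S, [0.0, 0.0], [8.0, 4.0], n=3)
print(subgoals)
```

Because the goal is appended unchanged at every step, only the START half of the input varies over time, which is what lets one network emit an arbitrary number of subgoals.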

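Again we use $E$ to compute $eval(s^p(k-1), s^p(k))$, $k = 1, \ldots, n+1$, from $s^p(k-1) \circ s^p(k)$, with the conventions $s^p(0) = s^p$ and $s^p(n+1) = g^p$. At the last step, $k = n+1$, the evaluator therefore sees the final subgoal together with the goal.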
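The chain of evaluations can be sketched in the same way. The code assumes that $s^p(0)$ is the start and $s^p(n+1)$ is the goal, and again reads $\circ$ as concatenation. `evaluate_chain` and `distance_E` are hypothetical names, and `distance_E` (a plain Euclidean distance) merely stands in for the trained evaluator $E$.

```python
import math
from typing import Callable, List

Vector = List[float]


def evaluate_chain(E: Callable[[Vector], float], start: Vector,
                   subgoals: List[Vector], goal: Vector) -> List[float]:
    """Return eval(s(k-1), s(k)) for k = 1, ..., n+1."""
    # Assumed conventions: s(0) is the start, s(n+1) is the goal.
    states = [list(start)] + [list(s) for s in subgoals] + [list(goal)]
    return [E(states[k - 1] + states[k]) for k in range(1, len(states))]


def distance_E(x: Vector) -> float:
    """Hypothetical stand-in for E: Euclidean distance between the two halves."""
    d = len(x) // 2
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x[:d], x[d:])))


costs = evaluate_chain(distance_E, [0.0, 0.0],
                       [[3.0, 4.0], [6.0, 8.0]], [6.0, 8.0])
print(costs)
```

With $n$ subgoals the list has $n+1$ entries; the last one pairs the final subgoal with the goal, which corresponds to the dashed line in Figure 3.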

Figure 3: A recurrent subgoal generator emitting an arbitrary number of subgoals in response to a start/goal combination. Each subgoal is fed back to the START-input of the subgoal generator. The dashed line indicates that the evaluator needs to see the GOAL at the last step of the subgoal generation process. See text for details. The picture itself can be found in Schmidhuber's Habilitation thesis.



Juergen Schmidhuber 2003-03-14
