Two additional output units for C were introduced for
controlling fovea *rotations* (around the fovea center),
one for each of the directions
'clockwise' and 'counter-clockwise'.
Thus the number of M's input units
increased to 46. At a given time, a clockwise rotation
was computed by mapping (through a multiplication
operation) the
current activation of the first additional
output unit to a rotation angle between 0 and 50 degrees.
The counter-clockwise rotation
was computed by mapping
the current activation of the second additional
output unit to a rotation angle between -50 and 0 degrees.
The final rotation was the sum of both rotations.
The same initialization conditions and learning
rates as with the translation experiments were employed.
As expected, the learning
of fovea trajectories which include rotations proved to
be more difficult than the learning of pure translation
sequences. 100000 training examples for M and
20000 training trajectories for C were employed.
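
The rotation mapping described above can be sketched as follows. This is only an illustration, not the original implementation: it assumes that both output activations lie in the interval [0, 1] (the text does not state the activation range), and the function name and the `max_angle` parameter are hypothetical.

```python
def rotation_from_activations(a_cw, a_ccw, max_angle=50.0):
    """Combine two output-unit activations into one signed rotation in degrees.

    Assumption (not stated in the text): both activations lie in [0, 1].
    Positive results mean clockwise, negative mean counter-clockwise.
    """
    for a in (a_cw, a_ccw):
        if not 0.0 <= a <= 1.0:
            raise ValueError("activation outside the assumed range [0, 1]")
    # First additional unit: multiplication maps [0, 1] onto [0, max_angle].
    clockwise = a_cw * max_angle
    # Second additional unit: multiplication maps [0, 1] onto [-max_angle, 0].
    counter_clockwise = -a_ccw * max_angle
    # The final rotation is the sum of both contributions.
    return clockwise + counter_clockwise
```

Under these assumptions, two equally active units cancel each other, so a net rotation requires one unit to dominate the other.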

Consider figures 6-9: At the beginning of each trajectory, both the fovea and the test object (a triangle) were arbitrarily positioned and rotated in the pixel field. (However, the receptive fields of the input units partially overlapped the object.) The fovea rotation at each time step of a trajectory is indicated by the direction of an arrow. The task was to generate a fovea trajectory which led the center of the fovea to a predefined point near the center of the triangle such that the arrow pointed towards the corner with the smallest angle.
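
To make the target orientation concrete, the following sketch computes which corner of a triangle has the smallest interior angle and the direction from a given fovea center towards it. It is a geometric illustration only and plays no role in the learning procedure. The function names are hypothetical, and the convention of measuring the heading in degrees from the positive x-axis is an assumption.

```python
import math


def smallest_angle_corner(triangle):
    """Return the vertex of `triangle` with the smallest interior angle.

    `triangle` is a sequence of three (x, y) points.
    """
    def interior_angle(p, q, r):
        # Angle at p between the rays p->q and p->r, in radians.
        v1 = (q[0] - p[0], q[1] - p[1])
        v2 = (r[0] - p[0], r[1] - p[1])
        dot = v1[0] * v2[0] + v1[1] * v2[1]
        cross = v1[0] * v2[1] - v1[1] * v2[0]
        return abs(math.atan2(cross, dot))

    a, b, c = triangle
    angles = [interior_angle(a, b, c),
              interior_angle(b, c, a),
              interior_angle(c, a, b)]
    return triangle[angles.index(min(angles))]


def target_heading(fovea_center, triangle):
    """Direction (degrees from the positive x-axis, an assumed convention)
    from the fovea center towards the smallest-angle corner."""
    corner = smallest_angle_corner(triangle)
    return math.degrees(math.atan2(corner[1] - fovea_center[1],
                                   corner[0] - fovea_center[0]))
```

For example, for the triangle with corners (0, 0), (4, 0) and (0, 1), the sharpest corner is (4, 0).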

*The experiments show that the learning of
successful fovea trajectories
involving translations and rotations is possible, although
M usually makes erroneous predictions.*
See [1]
for additional experiments.

It should be noted that we currently cannot answer general questions like: How many input units and how many hidden units are necessary for which kind of visual scenes? What are the optimal learning rates?
