Two additional output units for were introduced to control fovea rotations (around the fovea center), one for each of the directions `clockwise' and `counter-clockwise'. Thus the number of 's input units increased to 46. At each time step, the clockwise rotation was computed by mapping (through a multiplication operation) the current activation of the first additional output unit to a rotation angle between 0 and 50 degrees. The counter-clockwise rotation was computed by mapping the current activation of the second additional output unit to a rotation angle between -50 and 0 degrees. The final rotation was the sum of both rotations. The same initialization conditions and learning rates as in the translation experiments were employed. As expected, learning fovea trajectories that include rotations proved more difficult than learning pure translation sequences. 100,000 training examples for and 20,000 training trajectories for were employed.
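The rotation mapping described above can be illustrated with a minimal sketch. It assumes that both additional output units produce activations in the interval [0, 1] (the text gives only the resulting angle ranges, so this activation range is an assumption), and the names `MAX_ANGLE_DEG` and `fovea_rotation` are hypothetical, introduced here for illustration only.

```python
MAX_ANGLE_DEG = 50.0  # largest rotation per time step, as stated in the text


def fovea_rotation(cw_activation: float, ccw_activation: float) -> float:
    """Return the net fovea rotation in degrees for one time step.

    Assumption: both activations lie in [0, 1]. Clockwise rotations are
    counted as positive, counter-clockwise rotations as negative.
    """
    for a in (cw_activation, ccw_activation):
        if not 0.0 <= a <= 1.0:
            raise ValueError("activations are assumed to lie in [0, 1]")

    # Map each activation to its angle range by a single multiplication.
    clockwise = MAX_ANGLE_DEG * cw_activation            # in [0, 50]
    counter_clockwise = -MAX_ANGLE_DEG * ccw_activation  # in [-50, 0]

    # The final rotation is the sum of both contributions.
    return clockwise + counter_clockwise
```

Under these assumptions, equal activations of the two units cancel, and the net rotation always lies between -50 and 50 degrees.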
Consider figures 6-9: At the beginning of each trajectory, both the fovea and the test object (a triangle) were arbitrarily positioned and rotated in the pixel field. (However, the receptive fields of the input units partially overlapped the object.) The fovea rotation at each time step of a trajectory is indicated by the direction of an arrow. The task was to generate a fovea trajectory which led the center of the fovea to a predefined point near the center of the triangle such that the arrow pointed towards the corner with the smallest angle.
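The orientation part of this goal can be made concrete with a small geometric sketch. It is not taken from the experiments: the function names are hypothetical, the triangle is given as three (x, y) vertices, and directions are measured counter-clockwise from the positive x-axis in the usual mathematical convention (a pixel field with a downward y-axis would flip the sign).

```python
import math


def interior_angle(p, q, r):
    """Interior angle (degrees) at vertex p of the triangle (p, q, r)."""
    v1 = (q[0] - p[0], q[1] - p[1])
    v2 = (r[0] - p[0], r[1] - p[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    cross = v1[0] * v2[1] - v1[1] * v2[0]
    return math.degrees(math.atan2(abs(cross), dot))


def sharpest_corner(triangle):
    """Return the vertex of the triangle with the smallest interior angle."""
    a, b, c = triangle
    angles = [
        interior_angle(a, b, c),
        interior_angle(b, c, a),
        interior_angle(c, a, b),
    ]
    return triangle[angles.index(min(angles))]


def target_heading(fovea_center, triangle):
    """Direction (degrees) from the fovea center to the sharpest corner."""
    corner = sharpest_corner(triangle)
    dx = corner[0] - fovea_center[0]
    dy = corner[1] - fovea_center[1]
    return math.degrees(math.atan2(dy, dx))
```

For example, for the triangle with vertices (0, 0), (4, 0) and (0, 1), the sharpest corner is (4, 0), so a fovea centered at (1, 0) would need its arrow to point along the positive x-axis.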
The experiments show that learning successful fovea trajectories involving translations and rotations is possible, although usually makes erroneous predictions. See  for additional experiments.
It should be noted that we currently cannot answer general questions like: How many input units and how many hidden units are necessary for which kind of visual scenes? What are the optimal learning rates?