Bibliography

Anderson, 1986
Anderson, C. W. (1986).
Learning and Problem Solving with Multilayer Connectionist Systems.
PhD thesis, University of Massachusetts, Dept. of Comp. and Inf. Sci.

Barto, 1989
Barto, A. G. (1989).
Connectionist approaches for control.
Technical Report COINS 89-89, University of Massachusetts, Amherst MA 01003.

Barto et al., 1983
Barto, A. G., Sutton, R. S., and Anderson, C. W. (1983).
Neuronlike adaptive elements that can solve difficult learning control problems.
IEEE Transactions on Systems, Man, and Cybernetics, SMC-13:834-846.

Jameson, 1991
Jameson, J. (1991).
Delayed reinforcement learning with multiple time scale hierarchical backpropagated adaptive critics.
In Neural Networks for Control.

LeCun, 1985
LeCun, Y. (1985).
Une procédure d'apprentissage pour réseau à seuil asymétrique.
Proceedings of Cognitiva 85, Paris, pages 599-604.

Lin, 1991
Lin, L. (1991).
Self-improving reactive agents: Case studies of reinforcement learning frameworks.
In Meyer, J. A. and Wilson, S. W., editors, Proc. of the International Conference on Simulation of Adaptive Behavior: From Animals to Animats, pages 297-305. MIT Press/Bradford Books.

Parker, 1985
Parker, D. B. (1985).
Learning-logic.
Technical Report TR-47, Center for Comp. Research in Economics and Management Sci., MIT.

Ring, 1991
Ring, M. B. (1991).
Incremental development of complex behaviors through automatic construction of sensory-motor hierarchies.
In Birnbaum, L. and Collins, G., editors, Machine Learning: Proceedings of the Eighth International Workshop, pages 343-347. Morgan Kaufmann.

Robinson and Fallside, 1987
Robinson, A. J. and Fallside, F. (1987).
The utility driven dynamic error propagation network.
Technical Report CUED/F-INFENG/TR.1, Cambridge University Engineering Department.

Rumelhart et al., 1986
Rumelhart, D. E., Hinton, G. E., and Williams, R. J. (1986).
Learning internal representations by error propagation.
In Rumelhart, D. E. and McClelland, J. L., editors, Parallel Distributed Processing, volume 1, pages 318-362. MIT Press.

Schmidhuber, 1991a
Schmidhuber, J. (1991a).
Learning to generate sub-goals for action sequences.
In Kohonen, T., Mäkisara, K., Simula, O., and Kangas, J., editors, Artificial Neural Networks, pages 967-972. Elsevier Science Publishers B.V., North-Holland.

Schmidhuber, 1991b
Schmidhuber, J. (1991b).
Reinforcement learning in Markovian and non-Markovian environments.
In Lippmann, R. P., Moody, J. E., and Touretzky, D. S., editors, Advances in Neural Information Processing Systems 3, pages 500-506. Morgan Kaufmann.

Schmidhuber, 1992
Schmidhuber, J. (1992).
A fixed size storage $O(n^3)$ time complexity learning algorithm for fully recurrent continually running networks.
Neural Computation, 4(2):243-248.

Singh, 1992
Singh, S. (1992).
The efficient learning of multiple task sequences.
In Moody, J., Hanson, S., and Lippmann, R., editors, Advances in Neural Information Processing Systems 4, pages 251-258, San Mateo, CA. Morgan Kaufmann.

Sutton, 1984
Sutton, R. S. (1984).
Temporal Credit Assignment in Reinforcement Learning.
PhD thesis, University of Massachusetts, Dept. of Comp. and Inf. Sci.

Watkins, 1989
Watkins, C. (1989).
Learning from Delayed Rewards.
PhD thesis, King's College, Cambridge.

Werbos, 1974
Werbos, P. J. (1974).
Beyond Regression: New Tools for Prediction and Analysis in the Behavioral Sciences.
PhD thesis, Harvard University.

Williams, 1988
Williams, R. J. (1988).
Toward a theory of reinforcement-learning connectionist systems.
Technical Report NU-CCS-88-3, College of Comp. Sci., Northeastern University, Boston, MA.

Williams, 1989
Williams, R. J. (1989).
Complexity of exact gradient computation algorithms for recurrent neural networks.
Technical Report NU-CCS-89-27, Boston: Northeastern University, College of Computer Science.

Williams and Zipser, 1994
Williams, R. J. and Zipser, D. (1994).
Gradient-based learning algorithms for recurrent networks and their computational complexity.
In Back-propagation: Theory, Architectures and Applications. Hillsdale, NJ: Erlbaum.



Juergen Schmidhuber 2003-03-14
