Juergen Schmidhuber's publications by topic

JÜRGEN SCHMIDHUBER'S PUBLICATIONS BY TOPIC (last update 1998)

Most of the files below are gzipped PostScript files. Decompress them with "gunzip".

Alternatively, choose publications by type and date.

Note: some papers address several issues - that's why they are repeatedly listed under different topics.

Please do not hesitate to contact juergen@idsia.ch in case this page contains statements that you consider false or misleading.


LOW-COMPLEXITY NEURAL NETWORKS, GENERALIZATION, OCCAM's RAZOR

Many of our machine learning algorithms, in one way or another, discover and exploit initially unknown environmental regularities. Regularity implies algorithmic compressibility - inductive learning and generalization are closely related to data compression. For instance, a "minimum description length"-based argument shows that flat minima of typical neural network error functions correspond to low expected overfitting/high generalization. In applications to stock market prediction, flat minimum search [2-3, 5] (with Sepp Hochreiter) outperforms other widely used competitors.
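
To make the notion of a "flat" minimum concrete, here is a minimal sketch on a made-up one-dimensional error surface. It is not the flat minimum search algorithm of refs [2-3, 5]; the functions `loss` and `sharpness` and all constants are hypothetical. It only shows that two minima with the same error can differ greatly in how much the error grows when the weight is perturbed, which is one way to read "weights that need little precision to describe".

```python
def loss(w):
    # Hypothetical 1-D error surface: sharp minimum at w = -1, flat minimum at w = 2.
    return min(50.0 * (w + 1.0) ** 2, 0.5 * (w - 2.0) ** 2)


def sharpness(w, radius=0.1, samples=21):
    """Largest increase in loss inside the interval [w - radius, w + radius]."""
    points = [w - radius + 2 * radius * i / (samples - 1) for i in range(samples)]
    return max(loss(p) for p in points) - loss(w)


if __name__ == "__main__":
    # Both minima have zero error, but the one at w = 2 tolerates imprecise weights far better.
    print("sharp minimum:", sharpness(-1.0))
    print("flat minimum: ", sharpness(2.0))
```

In this toy setting the weight at the flat minimum could be stored with much lower precision without hurting the error noticeably.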

In related but perhaps even more ambitious work, a derivative of Levin's universal search algorithm is used to discover neural nets with low Levin complexity, low Kolmogorov complexity, and high generalization capability [1,4]. At least with certain toy problems where it is computationally feasible, the method can lead to generalization results unmatchable by traditional neural net algorithms.
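
The general flavor of Levin-style universal search can be sketched as follows. This is a toy under stated assumptions, not the system of refs [1,4]: the two-instruction language, the interpreter `run`, and the task (reach a target integer from 1) are all hypothetical. In each phase, every candidate program of length l gets a step budget of 2^(phase - l), so short programs are tried first and get the most time.

```python
from itertools import product


def run(program, budget, start=1):
    """Hypothetical toy interpreter: '0' adds 1, '1' doubles; one step per instruction.

    Returns the final value, or None if the step budget is exhausted first.
    """
    value = start
    for steps, op in enumerate(program, start=1):
        if steps > budget:
            return None
        value = value + 1 if op == "0" else value * 2
    return value


def levin_search(target, max_phase=20):
    """Return (program, phase) for the first program found that outputs `target`."""
    for phase in range(1, max_phase + 1):
        for length in range(1, phase + 1):
            budget = 2 ** (phase - length)  # shorter programs get more time
            for bits in product("01", repeat=length):
                program = "".join(bits)
                if run(program, budget) == target:
                    return program, phase
    return None


if __name__ == "__main__":
    print(levin_search(10))
```

The search favors short, fast programs, which is the link to low Levin complexity mentioned above.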

5.
S. Hochreiter and J. Schmidhuber. Feature extraction through LOCOCODE. Neural Computation 11(3): 679-714, 1999 (28 pages, 20 figures, 703 K, 4.9 M gunzipped).

4.
J. Schmidhuber. Discovering neural nets with low Kolmogorov complexity and high generalization capability. Neural Networks, 10(5):857-873, 1997 (123 K).

3.
S. Hochreiter and J. Schmidhuber. Flat Minima. Neural Computation, 9(1):1-42, 1997 (201 K).

2.
S.  Hochreiter and J.  Schmidhuber. Simplifying neural nets by discovering flat minima. In G. Tesauro, D. S. Touretzky and T. K. Leen, eds., Advances in Neural Information Processing Systems 7, pages 529-536. MIT Press, Cambridge MA, 1995.

1.
J.  Schmidhuber. Discovering solutions with low Kolmogorov complexity and high generalization capability. In A. Prieditis and S. Russell, editors, Machine Learning: Proceedings of the Twelfth International Conference, pages 488-496. Morgan Kaufmann Publishers, San Francisco, CA, 1995.


SUPERVISED RECURRENT NEURAL NETWORKS

We introduced various novel learning algorithms for recurrent neural nets with time-varying inputs. The most remarkable achievement so far is Long Short-Term Memory (with Sepp Hochreiter, 1995 -), an algorithm that avoids many drawbacks of previous approaches: LSTM can learn to bridge very long time lags by enforcing constant error flow back through time. In experimental comparisons with competing approaches, LSTM leads to many more successful runs, and learns much faster.
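
The phrase "constant error flow" can be illustrated with a single self-connected unit. The sketch below is only that: it omits gates and everything else that makes up LSTM, and the function `error_flow` and its parameters are hypothetical. It computes the factor by which an error signal is scaled when propagated back through a number of time steps, assuming the unit computes y_t = f(weight * y_{t-1}).

```python
import math


def error_flow(weight, steps, activation="linear", y0=0.5):
    """Scale factor applied to an error propagated back `steps` steps through one unit."""
    y, factor = y0, 1.0
    for _ in range(steps):
        net = weight * y
        if activation == "linear":
            y, slope = net, 1.0
        else:  # tanh unit
            y = math.tanh(net)
            slope = 1.0 - y * y  # derivative of tanh at `net`
        factor *= slope * weight  # chain rule: dy_t / dy_{t-1}
    return factor


if __name__ == "__main__":
    print(error_flow(1.0, 100, "linear"))  # linear unit, self-weight 1.0: factor stays 1
    print(error_flow(0.9, 100, "tanh"))    # conventional unit: factor shrinks toward 0
```

With a linear unit and a self-connection of weight 1.0 the factor stays at 1 regardless of the time lag, whereas for the squashing unit it decays exponentially.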

Earlier work on "History Compression" exploited regularities in partly predictable symbol strings to accelerate sequence classification with recurrent nets. An implementation based on hierarchical recurrent nets easily outperforms previous approaches when it comes to learning regular grammars from symbol sequences with long time lags between occurrences of relevant symbols (but there have to be local regularities).
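
A minimal sketch of the history compression principle, assuming a trivial lookup-table predictor in place of a recurrent net (the function `compress_history` and the predictor are hypothetical stand-ins): only the symbols the predictor fails to predict, together with their positions, are passed on, so a higher level sees a much shorter sequence.

```python
def compress_history(sequence):
    """Return the (position, symbol) pairs that a simple online predictor failed to predict."""
    predictor = {}    # maps a symbol to the successor most recently seen after it
    unexpected = []
    previous = None
    for position, symbol in enumerate(sequence):
        if predictor.get(previous) != symbol:
            unexpected.append((position, symbol))  # surprise: pass it to the higher level
        predictor[previous] = symbol               # online update of the predictor
        previous = symbol
    return unexpected


if __name__ == "__main__":
    reduced = compress_history("abcabcabcxabc")
    print(reduced)
    print(len(reduced), "of", len("abcabcabcxabc"), "symbols passed on")
```

The reduced description works only because the input has local regularities; on a sequence without them, every symbol would be passed on.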

Several papers on other recurrent net aspects are included below.

16.
S. Hochreiter and J. Schmidhuber. Long Short-Term Memory. Neural Computation, 9(8):1735-1780, 1997 (170 K).

15.
S. Hochreiter and J. Schmidhuber. LSTM can solve hard long time lag problems. In M. C. Mozer, M. I. Jordan, T. Petsche, eds., Advances in Neural Information Processing Systems 9, pages 473-479, MIT Press, Cambridge MA, 1997.

14.
S. Hochreiter and J. Schmidhuber. Bridging long time lags by weight guessing and "Long Short-Term Memory". In F. L. Silva, J. C. Principe, L. B. Almeida, eds., Frontiers in Artificial Intelligence and Applications, Volume 37, pages 65-72, IOS Press, Amsterdam, Netherlands, 1996.

13.
J. Schmidhuber and S. Hochreiter. Guessing can outperform many long time lag algorithms. Technical Note IDSIA-19-96, IDSIA, May 1996.

12.
J.  Schmidhuber. A self-referential weight matrix. In Proceedings of the International Conference on Artificial Neural Networks, Amsterdam, pages 446-451. Springer, 1993.

11.
J.  Schmidhuber. On decreasing the ratio between learning complexity and number of time-varying variables in fully recurrent nets. In Proceedings of the International Conference on Artificial Neural Networks, Amsterdam, pages 460-463. Springer, 1993.

10.
J.  Schmidhuber. Netzwerkarchitekturen, Zielfunktionen und Kettenregel. (Net architectures, objective functions, and chain rule.) Habilitationsschrift (postdoctoral thesis), Institut für Informatik, Technische Universität München, 1993. Part I (196 K), Part II (148 K), Part III (213 K).

9.
J. Schmidhuber. Learning complex, extended sequences using the principle of history compression. Neural Computation, 4(2):234-242, 1992 (41 K).

8.
J. Schmidhuber. Learning unambiguous reduced sequence descriptions. In J. E. Moody, S. J. Hanson, and R. P. Lippmann, editors, Advances in Neural Information Processing Systems 4, pages 291-298. San Mateo, CA: Morgan Kaufmann, 1992.

7.
J. Schmidhuber. A fixed size storage O(n^3) time complexity learning algorithm for fully recurrent continually running networks. Neural Computation, 4(2):243-248, 1992 (33 K).

6.
J. Schmidhuber. Learning to control fast-weight memories: An alternative to recurrent nets. Neural Computation, 4(1):131-139, 1992 (39 K).

5.
J.  Schmidhuber. Learning temporary variable binding with dynamic links. In Proc. International Joint Conference on Neural Networks, Singapore, volume 3, pages 2075-2079. IEEE, 1991.

2.
J.  Schmidhuber. Learning algorithms for networks with internal and external feedback. In D. S. Touretzky, J. L. Elman, T. J. Sejnowski, and G. E. Hinton, editors, Proc. of the 1990 Connectionist Models Summer School, pages 52-61. San Mateo, CA: Morgan Kaufmann, 1990.

1.
J.  Schmidhuber. Dynamische neuronale Netze und das fundamentale raumzeitliche Lernproblem. (Dynamic neural nets and the fundamental spatio-temporal credit assignment problem.) Dissertation, Institut für Informatik, Technische Universität München, 1990.


UNSUPERVISED NEURAL NETWORKS, REDUNDANCY REDUCTION, ICA

To my knowledge, "Predictability Minimization" (PM) [2,5,7,10,11,12,14,17] is the first non-linear neural algorithm for encoding input data consisting of non-linear mixtures of basic features by "factorial" codes with statistically independent components (ICA stands for "independent component analysis"). PM is a co-evolutionary, unsupervised learning algorithm based on neural feature detectors and predictors that fight each other (1991 - ). PM has various potential advantages over other neural methods for redundancy reduction. When applied to image data, PM automatically comes up with feature detectors reminiscent of those in biological systems (such as orientation sensitive edge detectors, on-center-off-surround detectors, bar detectors).

An alternative method called LOCOCODE (1995 - ) performs ICA as a by-product of discovering simple networks (with low information-theoretic complexity) coding the input data [15,16,19-22]. It can outperform previous methods for ICA and PCA, and establishes a link between regularization and unsupervised learning.

Automatic sequence compression methods [1,3,4,8] also can be classified as unsupervised coding approaches.

The "Neural Heat Exchanger" [13] (presented in talks since 1990) is a supervised variant of Hinton and Dayan's 1994 "Helmholtz machine".

22.
S. Hochreiter and J. Schmidhuber. Source separation as a by-product of regularization. To be presented at NIPS'98, 1998.

21.
S. Hochreiter and J. Schmidhuber. LOCOCODE performs nonlinear ICA without knowing the number of sources. To be presented at ICA'99, January 11-15, 1999.

20.
S. Hochreiter and J. Schmidhuber. Feature extraction through LOCOCODE. Neural Computation 11(3): 679-714, 1999 (28 pages, 20 figures, 703 K, 4.9 M gunzipped).

19.
S. Hochreiter and J. Schmidhuber. LOCOCODE versus PCA and ICA. In Proceedings of the International Conference on Artificial Neural Networks, Sweden, Springer, to appear 1998.

18.
J.  Schmidhuber. Neural predictors for detecting and removing redundant information. In H. Cruse, J. Dean, and H. Ritter, editors, Adaptive Behavior and Learning. Kluwer, 1998, in preparation.

17.
M. Eldracher, N. N. Schraudolph, and J. Schmidhuber, Processing Images by Semi-Linear Predictability Minimization. Technical Report IDSIA-77-97, 1997.

16.
S. Hochreiter and J. Schmidhuber. Low-complexity coding and decoding. In K. M. Wong, I. King, D. Yeung, eds., Theoretical Aspects of Neural Computation: a Multidisciplinary Perspective, pages 297-306, Springer, 1997.

15.
S. Hochreiter and J. Schmidhuber. Unsupervised coding with LOCOCODE. In W. Gerstner, A. Germond, M. Hasler, J.-D. Nicoud, eds., Proceedings of the International Conference on Artificial Neural Networks, Lausanne, Switzerland, Springer, 655-660, 1997.

14.
J. Schmidhuber and M. Eldracher and B. Foltin. Semilinear predictability minimization produces well-known feature detectors. Neural Computation, 8(4):773-786, 1996 (260 K).

13.
J.  Schmidhuber. The Neural Heat Exchanger. In S. Amari, L. Xu, L. Chan, I. King, K. Leung, eds., Progress in Neural Information Processing: Proceedings of the Intl. Conference on Neural Information Processing, pages 194-197, Springer, Hongkong, 1996. Earlier presentations in talks at universities since 1990.

12.
J. Schmidhuber and B. Foltin. Semilinear predictability minimization produces orientation sensitive edge detectors. Technical Report FKI-201-94, Fakultät für Informatik, Technische Universität München, December 1994.

11.
J. Schmidhuber and D. Prelinger. Discovering predictable classifications. Neural Computation, 5(4):625-635, 1993 (51 K).

10.
J.  Schmidhuber and D. Prelinger. Unsupervised extraction of predictable abstract features. In Proceedings of the International Conference on Artificial Neural Networks, Amsterdam, pages 601-604. Springer, 1993.

9.
J.  Schmidhuber and D. Prelinger. A novel unsupervised classification method. In Proc. of the Intl. Conf. on Artificial Neural Networks, Brighton, pages 91-96. IEE, 1993.

8.
J.  Schmidhuber, M. C. Mozer, and D. Prelinger. Continuous history compression. In H. Hüning, S. Neuhauser, M. Raus, and W. Ritschel, editors, Proc. of Intl. Workshop on Neural Networks, RWTH Aachen, pages 87-95. Augustinus, 1993.

7.
J. Schmidhuber. Learning factorial codes by predictability minimization. Neural Computation, 4(6):863-879, 1992 (53 K).

6.
J. Schmidhuber. Learning complex, extended sequences using the principle of history compression. Neural Computation, 4(2):234-242, 1992 (41 K).

5.
J.  Schmidhuber and D. Prelinger. Discovering predictable classifications. Technical Report CU-CS-626-92, Dept. of Comp. Sci., University of Colorado at Boulder, November 1992.

4.
J. Schmidhuber. Learning unambiguous reduced sequence descriptions. In J. E. Moody, S. J. Hanson, and R. P. Lippmann, editors, Advances in Neural Information Processing Systems 4, pages 291-298. San Mateo, CA: Morgan Kaufmann, 1992.

3.
J.  Schmidhuber. Adaptive decomposition of time. In T. Kohonen, K. Mäkisara, O. Simula, and J. Kangas, editors, Artificial Neural Networks, pages 909-914. Elsevier Science Publishers B.V., North-Holland, 1991.

2.
J.  Schmidhuber. Learning factorial codes by predictability minimization. Technical Report CU-CS-565-91, Dept. of Comp. Sci., University of Colorado at Boulder, December 1991.

1.
J.  Schmidhuber. Neural sequence chunkers. Technical Report FKI-148-91, Institut für Informatik, Technische Universität München, April 1991.

SEQUENCE COMPRESSION

Sequence compression is an important special case of redundancy reduction (compare the section on unsupervised learning and ICA).

8.
J.  Schmidhuber and S.  Heil. Compressing texts with neural nets. In Dale, Moisl and Somers, eds., Handbook of Natural Language Processing, Marcel Dekker, Inc., to appear 1998.

7.
J. Schmidhuber and S. Heil. Sequential neural text compression. IEEE Transactions on Neural Networks, 7(1):142-146, 1996 (68 K).

6.
J.  Schmidhuber and S.  Heil. Predictive coding with neural nets: Application to text compression. In G. Tesauro, D. S. Touretzky and T. K. Leen, eds., Advances in Neural Information Processing Systems 7, pages 1047-1054. MIT Press, Cambridge MA, 1995.

5.
J.  Schmidhuber, M. C. Mozer, and D. Prelinger. Continuous history compression. In H. Hüning, S.  Neuhauser, M. Raus, and W. Ritschel, editors, Proc. of Intl. Workshop on Neural Networks, RWTH Aachen, pages 87-95. Augustinus, 1993.

4.
J. Schmidhuber. Learning complex, extended sequences using the principle of history compression. Neural Computation, 4(2):234-242, 1992 (41 K).

3.
J. Schmidhuber. Learning unambiguous reduced sequence descriptions. In J. E. Moody, S. J. Hanson, and R. P. Lippmann, editors, Advances in Neural Information Processing Systems 4, pages 291-298. San Mateo, CA: Morgan Kaufmann, 1992.

2.
J.  Schmidhuber. Adaptive history compression for learning to divide and conquer. In Proc. International Joint Conference on Neural Networks, Singapore, volume 2, pages 1130-1135. IEEE, 1991.

1.
J.  Schmidhuber. Adaptive decomposition of time. In T. Kohonen, K. Mäkisara, O. Simula, and J. Kangas, editors, Artificial Neural Networks, pages 909-914. Elsevier Science Publishers B.V., North-Holland, 1991.

0.
J.  Schmidhuber. Neural sequence chunkers. Technical Report FKI-148-91, Institut für Informatik, Technische Universität München, April 1991.


STOCK MARKET PREDICTION

Our most lucrative application.

2.
S. Hochreiter and J. Schmidhuber. Flat Minima. Neural Computation, 9(1):1-42, 1997 (201 K).

1.
S.  Hochreiter and J.  Schmidhuber. Simplifying neural nets by discovering flat minima. In G. Tesauro, D. S. Touretzky and T. K. Leen, eds., Advances in Neural Information Processing Systems 7, pages 529-536. MIT Press, Cambridge MA, 1995.


REINFORCEMENT LEARNING IN FULLY OBSERVABLE WORLDS

Most work in mainstream reinforcement learning assumes that the learner's current input tells it everything about the environmental state. This is often unrealistic but makes things much easier.

4.
M. Wiering and J. Schmidhuber. Fast online Q(lambda). Machine Learning, accepted 1998 (80 K).

3.
M. Wiering and J. Schmidhuber. Efficient model-based exploration. In From Animals to Animats 5: Proceedings of the Fifth International Conference on Simulation of Adaptive Behavior, 1998, in press.

2.
J. Storck, S. Hochreiter, and J.  Schmidhuber. Reinforcement-driven information acquisition in non-deterministic environments. In Proc. ICANN'95, vol. 2, pages 159-164. EC2 & CIE, Paris, 1995.

1.
J.  Schmidhuber. Curious model-building control systems. In Proc. International Joint Conference on Neural Networks, Singapore, volume 2, pages 1458-1463. IEEE, 1991.


REINFORCEMENT LEARNING IN PARTIALLY OBSERVABLE WORLDS

Many of our learning agents have an internal state that they can use to memorize important events. The question is: how can they learn to identify and store those events relevant for further optimal action selection? To address this issue we have studied reinforcement learners with (a) recurrent neural net value function approximators (1990 -), (b) recurrent neural net world models (1990 -), (c) actions that address and set internal storage cells, trained by the success-story algorithm (1994 -), (d) direct search in a space of event-memorizing algorithms, (e) other things.

21.
M. Wiering and J. Schmidhuber. HQ-Learning. Adaptive Behavior 6(2):219-246, 1998 (122 K).

20.
R. Salustowicz and M. Wiering and J. Schmidhuber. Learning team strategies: soccer case studies. Machine Learning, to appear 1998 (127 K).

19.
J. Schmidhuber, J. Zhao, and M. Wiering. Shifting inductive bias with success-story algorithm, adaptive Levin search, and incremental self-improvement. Machine Learning 28:105-130, 1997.

18.
J.  Schmidhuber, J.  Zhao, N. Schraudolph. Reinforcement learning with self-modifying policies. In S. Thrun and L. Pratt, eds., Learning to learn, Kluwer, pages 293-309, 1997.

17.
R. Salustowicz and J. Schmidhuber. Probabilistic incremental program evolution. Evolutionary Computation, 5(2):123-141, 1997.

16.
M. Wiering and J. Schmidhuber. Solving POMDPs using Levin search and EIRA. In L. Saitta, ed., Machine Learning: Proceedings of the 13th International Conference, pages 534-542, Morgan Kaufmann Publishers, San Francisco, CA, 1996.

15.
M. Wiering and J. Schmidhuber. HQ-Learning: Discovering Markovian subgoals for non-Markovian reinforcement learning. Technical Report IDSIA-95-96, IDSIA, October 1996.

14.
J.  Schmidhuber and J.  Zhao and M.  Wiering. Simple principles of metalearning. Technical Report IDSIA-69-96, IDSIA, June 1996.

13.
J. Schmidhuber. On learning how to learn learning strategies. Technical Report FKI-198-94, Fakultät für Informatik, Technische Universität München, November 1994.

12.
J. Schmidhuber. Reinforcement learning in Markovian and non-Markovian environments. In R. P. Lippmann, J. E. Moody, and D. S. Touretzky, editors, Advances in Neural Information Processing Systems 3, pages 500-506. San Mateo, CA: Morgan Kaufmann, 1991.

11.
J. Schmidhuber and R. Huber. Learning to generate artificial fovea trajectories for target detection. International Journal of Neural Systems, 2(1 & 2):135-141, 1991 (50 K - figures omitted!).

10.
J.  Schmidhuber and R. Huber. Using sequential adaptive neuro-control for efficient learning of rotation and translation invariance. In T. Kohonen, K. Mäkisara, O. Simula, and J. Kangas, editors, Artificial Neural Networks, pages 315-320. Elsevier Science Publishers B.V., North-Holland, 1991.

9.
J.  Schmidhuber. Learning algorithms for networks with internal and external feedback. In D. S. Touretzky, J. L. Elman, T. J. Sejnowski, and G. E. Hinton, editors, Proc. of the 1990 Connectionist Models Summer School, pages 52-61. San Mateo, CA: Morgan Kaufmann, 1990.

8.
J.  Schmidhuber. An on-line algorithm for dynamic reinforcement learning and planning in reactive environments. In Proc. IEEE/INNS International Joint Conference on Neural Networks, San Diego, volume 2, pages 253-258, 1990.

7.
J.  Schmidhuber. Reinforcement learning with interacting continually running fully recurrent networks. In Proc. INNC International Neural Network Conference, Paris, volume 2, pages 817-820, 1990.

6.
J.  Schmidhuber. Temporal-difference-driven learning in recurrent networks. In R. Eckmiller, G. Hartmann, and G. Hauske, editors, Parallel Processing in Neural Systems and Computers, pages 209-212. North-Holland, 1990.

5.
J.  Schmidhuber. Reinforcement-Lernen und adaptive Steuerung. Nachrichten Neuronale Netze, 2:1-3, 1990.

4.
J.  Schmidhuber. Making the world differentiable: On using fully recurrent self-supervised neural networks for dynamic reinforcement learning and planning in non-stationary environments. Technical Report FKI-126-90, Institut für Informatik, Technische Universität München, February 1990 (revised in November).

3.
J.  Schmidhuber. Networks adjusting networks. In J. Kindermann and A. Linden, editors, Proceedings of `Distributed Adaptive Neural Information Processing', St.Augustin, 24.-25.5. 1989, pages 197-208. Oldenbourg, 1990. Extended version: TR FKI-125-90 (revised), Institut für Informatik, TUM.

2.
J.  Schmidhuber. Dynamische neuronale Netze und das fundamentale raumzeitliche Lernproblem. (Dynamic neural nets and the fundamental spatio-temporal credit assignment problem.) Dissertation, Institut für Informatik, Technische Universität München, 1990.

1.
J. Schmidhuber. A local learning algorithm for dynamic feedforward and recurrent networks. Connection Science, 1(4):403-412, 1989. (The Neural Bucket Brigade, 43 K - figures omitted!).



EXPLORATION - WHAT'S INTERESTING?

We introduced the first active reinforcement learning methods translating mismatches between expectations and reality into reinforcement for "curious", exploring agents (1991 - ): our agents like to go where they expect to learn something. One recent focus is on exploring the space of general algorithms (as opposed to traditional simple grid-worlds).
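
The idea of turning mismatches between expectation and reality into reinforcement can be sketched in a few lines. The example assumes a scalar observation and a running-average predictor as the world model; the function `curiosity_rewards` and its learning rate are hypothetical and far simpler than the model-building controllers in the papers below.

```python
def curiosity_rewards(observations, learning_rate=0.5):
    """Intrinsic reward at each step = absolute prediction error of a running-average model."""
    prediction = 0.0
    rewards = []
    for obs in observations:
        error = abs(obs - prediction)
        rewards.append(error)                              # mismatch becomes reward
        prediction += learning_rate * (obs - prediction)   # the model learns from the surprise
    return rewards


if __name__ == "__main__":
    # A constant observation becomes boring as the model learns to predict it.
    print(curiosity_rewards([1.0, 1.0, 1.0, 1.0]))
```

As the model improves, the reward for revisiting the same situation fades, which pushes an agent toward situations where it can still learn something.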

10.
J. Schmidhuber. What's interesting? In Abstract Collection of SNOWBIRD: Machines That Learn. Utah, April 1998.

9.
M. Wiering and J. Schmidhuber. Efficient model-based exploration. In From Animals to Animats 5: Proceedings of the Fifth International Conference on Simulation of Adaptive Behavior, 1998, in press.

8.
M. Wiering and J. Schmidhuber. Learning exploration policies with models. To appear in Proc. CONALD, 1998.

7.
J. Schmidhuber. What's interesting? Technical Report IDSIA-35-97, IDSIA, July 1997 (23 pages, 10 figures, 157 K, 834 K gunzipped).

6.
J.  Schmidhuber, J.  Zhao, N. Schraudolph. Reinforcement learning with self-modifying policies. In S. Thrun and L. Pratt, eds., Learning to learn, Kluwer, pages 293-309, 1997.

5.
J. Storck, S. Hochreiter, and J.  Schmidhuber. Reinforcement-driven information acquisition in non-deterministic environments. In Proc. ICANN'95, vol. 2, pages 159-164. EC2 & CIE, Paris, 1995.

4.
J. Schmidhuber. On learning how to learn learning strategies. Technical Report FKI-198-94, Fakultät für Informatik, Technische Universität München, November 1994.

3.
J.  Schmidhuber. Curious model-building control systems. In Proc. International Joint Conference on Neural Networks, Singapore, volume 2, pages 1458-1463. IEEE, 1991.

2.
J.  Schmidhuber. Adaptive curiosity and adaptive confidence. Technical Report FKI-149-91, Institut für Informatik, Technische Universität München, April 1991.

1.
J.  Schmidhuber. A possibility for implementing curiosity and boredom in model-building neural controllers. In J. A. Meyer and S. W. Wilson, editors, Proc. of the International Conference on Simulation of Adaptive Behavior: From Animals to Animats, pages 222-227. MIT Press/Bradford Books, 1991.

SUBGOAL DISCOVERY

There is no teacher providing useful intermediate subgoals for our reinforcement learning systems. Refs [1-4] use gradient-based subgoal generators; refs [5-6] search in discrete subgoal space.

6.
M. Wiering and J. Schmidhuber. HQ-Learning. Adaptive Behavior 6(2):219-246, 1998 (122 K).

5.
M. Wiering and J. Schmidhuber. HQ-Learning: Discovering Markovian subgoals for non-Markovian reinforcement learning. Technical Report IDSIA-95-96, IDSIA, October 1996.

4.
J.  Schmidhuber. Netzwerkarchitekturen, Zielfunktionen und Kettenregel. (Net architectures, objective functions, and chain rule.) Habilitationsschrift (postdoctoral thesis), Institut für Informatik, Technische Universität München, 1993. Part I (196 K), Part II (148 K), Part III (213 K).

3.
J.  Schmidhuber and R. Wahnsiedler. Planning simple trajectories using neural subgoal generators. In J. A. Meyer, H. L. Roitblat, and S. W. Wilson, editors, Proc. of the 2nd International Conference on Simulation of Adaptive Behavior, pages 196-202. MIT Press, 1992.

2.
J.  Schmidhuber. Learning to generate sub-goals for action sequences. In T. Kohonen, K. Mäkisara, O. Simula, and J. Kangas, editors, Artificial Neural Networks, pages 967-972. Elsevier Science Publishers B.V., North-Holland, 1991.

1.
J.  Schmidhuber. Towards compositional learning with dynamic neural networks. Technical Report FKI-129-90, Institut für Informatik, Technische Universität München, 1990.

GENETIC PROGRAMMING

Genetic Programming (GP) uses Genetic Algorithms to evolve computer programs. GP was invented by Nichael Cramer in 1985. We were not aware of his work when in 1987 we described the first (to our knowledge) "modern" GP approach directly operating on variable-length code [1]. We applied our system to simple tasks including the "lawnmower problem", later also studied by Koza (1994), whose home page and books on GP unfortunately leave the impression that he was the first to invent GP.

Pages 7-13 of ref [2] are devoted to an extended GP approach that recursively applies metalevel GP to the task of finding better program-modifying programs on lower levels - the goal is to use GP for improving GP.

2.
J.  Schmidhuber. Evolutionary principles in self-referential learning, or on learning how to learn: The meta-meta-... hook. Diploma thesis, Institut für Informatik, Technische Universität München, 1987.

1.
D. Dickmanns, J. Schmidhuber, and A. Winklhofer. Der genetische Algorithmus: Eine Implementierung in Prolog. Fortgeschrittenenpraktikum, Institut für Informatik, Lehrstuhl Prof. Radig, Technische Universität München, 1987.

PROGRAM EVOLUTION

There may be much better ways of evolving computer programs than GP's. Our contributions include Adaptive Levin Search (extending Levin's universal search algorithm, which is theoretically optimal for non-incremental search), and Probabilistic Incremental Program Evolution (PIPE).

10.
R.  Salustowicz and J.  Schmidhuber. Learning to predict through PIPE and automatic task decomposition. Technical Report IDSIA-11-98, IDSIA, April 1998.

9.
R. Salustowicz and M. Wiering and J. Schmidhuber. Learning team strategies: soccer case studies. Machine Learning, to appear 1998 (127 K).

8.
R.  Salustowicz and J.  Schmidhuber. Evolving structured programs with hierarchical instructions and skip nodes. Machine Learning: Proceedings of the 15th International Conference, Morgan Kaufmann Publishers, San Francisco, CA, to appear 1998.

7.
J. Schmidhuber, J. Zhao, and M. Wiering. Shifting inductive bias with success-story algorithm, adaptive Levin search, and incremental self-improvement. Machine Learning 28:105-130, 1997.

6.
R. Salustowicz and J. Schmidhuber. Probabilistic incremental program evolution. Evolutionary Computation, 5(2):123-141, 1997.

5.
J. Schmidhuber. Discovering neural nets with low Kolmogorov complexity and high generalization capability. Neural Networks, 10(5):857-873, 1997 (123 K).

4.
R.  Salustowicz and J.  Schmidhuber. Probabilistic incremental program evolution: stochastic search through program space. In van Someren, M., Widmer, G., editors, Machine Learning: ECML-97, Lecture Notes in Artificial Intelligence 1224, pages 213-220, Springer, 1997.

3.
M. Wiering and J. Schmidhuber. Solving POMDPs using Levin search and EIRA. In L. Saitta, ed., Machine Learning: Proceedings of the 13th International Conference, pages 534-542, Morgan Kaufmann Publishers, San Francisco, CA, 1996.

2.
J.  Schmidhuber. Evolutionary principles in self-referential learning, or on learning how to learn: The meta-meta-... hook. Diploma thesis, Institut für Informatik, Technische Universität München, 1987.

1.
D. Dickmanns, J. Schmidhuber, and A. Winklhofer. Der genetische Algorithmus: Eine Implementierung in Prolog. Fortgeschrittenenpraktikum, Institut für Informatik, Lehrstuhl Prof. Radig, Technische Universität München, 1987.

METALEARNING (LEARNING TO LEARN)

Metalearning means learning the credit assignment method itself. Metalearning may be the most ambitious but also the most rewarding goal of machine learning. There are few limits to what a good metalearner will learn. Where appropriate it will learn to learn by analogy, by chunking, by planning, by subgoal generation, by combinations thereof - you name it.

Don't be misled by confusion in some of the recent literature: "learning to learn" is orthogonal to "inductive transfer" (learning from successive tasks).

Most of our approaches to metalearning are based on self-modifying policies (SMPs). The learning algorithm of an SMP is part of the SMP itself - SMPs can modify the way they modify themselves. We have introduced several ways of forcing SMPs to come up with better and better self-modification algorithms: (a) The "success-story algorithm" [6-14], (b) Gradient calculation in recurrent nets [2-5], (c) Market models of the mind inspired by Holland's bucket brigade [1], (d) Genetic Programming on recursive meta-levels [1].

14.
J.  Schmidhuber. A general method for incremental self-improvement and multiagent learning. In X. Yao, editor, Evolutionary Computation: Theory and Applications. Chapter 3, pp.81-123, Scientific Publ. Co., Singapore, 1999 (submitted 1996).

13.
J.  Schmidhuber, J.  Zhao, N. Schraudolph. Reinforcement learning with self-modifying policies. In S. Thrun and L. Pratt, eds., Learning to learn, Kluwer, pages 293-309, 1997.

12.
J. Schmidhuber, J. Zhao, and M. Wiering. Shifting inductive bias with success-story algorithm, adaptive Levin search, and incremental self-improvement. Machine Learning 28:105-130, 1997.

11.
J.  Zhao and J.  Schmidhuber. Solving a complex prisoner's dilemma with self-modifying policies. In From Animals to Animats 5: Proceedings of the Fifth International Conference on Simulation of Adaptive Behavior, 1998, in press.

10.
J.  Schmidhuber and J.  Zhao and M.  Wiering. Simple principles of metalearning. Technical Report IDSIA-69-96, IDSIA, June 1996.

9.
M. Wiering and J. Schmidhuber. Solving POMDPs using Levin search and EIRA. In L. Saitta, ed., Machine Learning: Proceedings of the 13th International Conference, pages 534-542, Morgan Kaufmann Publishers, San Francisco, CA, 1996.

8.
J. Schmidhuber. Environment-independent reinforcement acceleration (invited talk at Hongkong University of Science and Technology). Technical Note IDSIA-59-95, June 1995.

7.
J.  Schmidhuber. Beyond "Genetic Programming": Incremental Self-Improvement. In J. Rosca, ed., Proc. Workshop on Genetic Programming at ML95, pages 42-49. National Resource Lab for the study of Brain and Behavior, 1995.

6.
J. Schmidhuber. On learning how to learn learning strategies. Technical Report FKI-198-94, Fakultät für Informatik, Technische Universität München, November 1994.

5.
J.  Schmidhuber. A neural network that embeds its own meta-levels. In Proc. of the International Conference on Neural Networks '93, San Francisco. IEEE, 1993.

4.
J.  Schmidhuber. An introspective network that can learn to run its own weight change algorithm. In Proc. of the Intl. Conf. on Artificial Neural Networks, Brighton, pages 191-195. IEE, 1993.

3.
J.  Schmidhuber. A self-referential weight matrix. In Proceedings of the International Conference on Artificial Neural Networks, Amsterdam, pages 446-451. Springer, 1993.

2.
J.  Schmidhuber. Steps towards `self-referential' learning. Technical Report CU-CS-627-92, Dept. of Comp. Sci., University of Colorado at Boulder, November 1992.

1.
J.  Schmidhuber. Evolutionary principles in self-referential learning, or on learning how to learn: The meta-meta-... hook. Diploma thesis, Institut für Informatik, Technische Universität München, 1987.



MARKET MODELS FOR MACHINE LEARNING

Pages 23-51 of ref [1] are devoted to a reinforcement learning approach called "prototypical self-referential learning mechanisms" (PSALM 1 - PSALM 3). PSALMs use competing "metalearning" agents with actions for generating and connecting agents and for assigning credit to agents, subject to the constraint that total credit is conserved (except for external reward and consumption).

Ref [3] describes a related but less general "economy" of neurons inspired by Holland's bucket brigade. External reward pays the incoming weights of currently active output units; the outgoing weights of an active unit U to other active units pay U's incoming weights (money = weight substance). Competition stems from partitioning the set of units into winner-take-all subsets.

3.
J. Schmidhuber. A local learning algorithm for dynamic feedforward and recurrent networks. Connection Science, 1(4):403-412, 1989. (The Neural Bucket Brigade, 43 K - figures omitted!).

2.
J.  Schmidhuber. The neural bucket brigade. In R. Pfeifer, Z. Schreter, Z. Fogelman, and L. Steels, editors, Connectionism in Perspective, pages 439-446. Amsterdam: Elsevier, North-Holland, 1989.

1.
J.  Schmidhuber. Evolutionary principles in self-referential learning, or on learning how to learn: The meta-meta-... hook. Diploma thesis, Institut für Informatik, Technische Universität München, 1987.

PROBABILISTIC PROGRAMMING LANGUAGES, SUCCESS-STORY ALGORITHM

My main focus has always been on letting the credit assignment strategy improve itself (metalearning - see above). Towards this end I have used "self-referential" probabilistic programs whose instructions may modify the underlying probability distribution. A backtracking scheme called the success-story algorithm occasionally undoes "bad" self-generated probability modifications and stabilizes "good" ones to make sure the system accelerates reward intake in the long run.
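
A minimal sketch of the backtracking idea follows. It assumes a policy stored as a dictionary of probabilities and a stack of checkpoints, and it keeps a modification only if the reward per time since that modification exceeds the reward per time since the previous surviving one (or since the start). The class and method names are hypothetical, and the papers listed below should be consulted for the actual formulation.

```python
class SuccessStory:
    """Toy success-story bookkeeping (illustrative names, not from the papers)."""

    def __init__(self, policy):
        self.policy = dict(policy)
        self.stack = []      # entries: (time, cumulative reward, saved old values)
        self.time = 0
        self.reward = 0.0

    def step(self, reward, duration=1):
        # Advance the system lifetime and accumulate reward.
        self.time += duration
        self.reward += reward

    def modify(self, key, value):
        # Self-modification: remember when it happened and what it overwrote.
        self.stack.append((self.time, self.reward, {key: self.policy[key]}))
        self.policy[key] = value

    def _rate_since(self, t, r):
        return (self.reward - r) / (self.time - t)

    def evaluate(self):
        # Pop and undo modifications until the newest surviving one has been
        # followed by faster reward intake than its predecessor.
        while self.stack:
            t, r, saved = self.stack[-1]
            base_t, base_r = self.stack[-2][:2] if len(self.stack) > 1 else (0, 0.0)
            if self.time > t and self._rate_since(t, r) > self._rate_since(base_t, base_r):
                break
            self.policy.update(saved)   # restore the overwritten values
            self.stack.pop()


agent = SuccessStory({"p_left": 0.5})
for _ in range(10):
    agent.step(reward=1.0)
agent.modify("p_left", 0.9)          # a "good" modification
for _ in range(10):
    agent.step(reward=2.0)
agent.evaluate()
agent.modify("p_left", 0.1)          # a "bad" modification
for _ in range(5):
    agent.step(reward=0.0)
agent.evaluate()                     # undoes 0.1, keeps 0.9
```

One design choice in this sketch: a modification that is evaluated before any time has passed is undone, since its effect on reward intake cannot yet be measured.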

8.
J. Schmidhuber and J. Zhao. Direct policy search and uncertain policy evaluation. 1999 AAAI Spring Symposium on Search under Uncertain and Incomplete Information, Stanford Univ., 1999.

7.
J. Schmidhuber, J. Zhao, and M. Wiering. Shifting inductive bias with success-story algorithm, adaptive Levin search, and incremental self-improvement. Machine Learning 28:105-130, 1997.

6.
J.  Schmidhuber. A general method for incremental self-improvement and multiagent learning. In X. Yao, editor, Evolutionary Computation: Theory and Applications. Chapter 3, pp.81-123, Scientific Publ. Co., Singapore, 1999 (submitted 1996).

5.
J.  Schmidhuber, J.  Zhao, N. Schraudolph. Reinforcement learning with self-modifying policies. In S. Thrun and L. Pratt, eds., Learning to learn, Kluwer, pages 293-309, 1997.

4.
J.  Schmidhuber and J.  Zhao. Multiagent learning with the success-story algorithm. In G. Weiss, ed., Distributed Artificial Intelligence Meets Machine Learning, pages 82-93, Springer, Berlin, 1997.

3.
J.  Zhao and J.  Schmidhuber. Incremental self-improvement for life-time multiagent reinforcement learning. In Pattie Maes, Maja Mataric, Jean-Arcady Meyer, Jordan Pollack, and Stewart W. Wilson, eds., From Animals to Animats 4: Proceedings of the Fourth International Conference on Simulation of Adaptive Behavior, pages 516-525, MIT Press, Bradford Books, Cambridge, MA, 1996.

2.
J. Schmidhuber, J. Zhao, and M. Wiering. Simple principles of metalearning. Technical Report IDSIA-69-96, IDSIA, June 1996.

1.
J. Schmidhuber. On learning how to learn learning strategies. Technical Report FKI-198-94, Fakultät für Informatik, Technische Universität München, November 1994.

MULTIAGENT LEARNING

Some of the approaches mentioned above are naturally applicable to multiagent learning.

12.
R. Salustowicz, M. Wiering, and J. Schmidhuber. Learning team strategies: soccer case studies. Machine Learning, to appear 1998 (127 K).

11.
M. Wiering and J. Schmidhuber. HQ-Learning. Adaptive Behavior 6(2):219-246, 1998 (122 K).

10.
M. Wiering and J. Schmidhuber. CMAC Models Learn to Play Soccer. In Proceedings of the International Conference on Artificial Neural Networks, Sweden, Springer, to appear 1998.

9.
J.  Zhao and J.  Schmidhuber. Solving a complex prisoner's dilemma with self-modifying policies. In From Animals to Animats 5: Proceedings of the Fifth International Conference on Simulation of Adaptive Behavior, 1998, in press.

8.
J.  Schmidhuber. A general method for incremental self-improvement and multiagent learning in unrestricted environments. In X. Yao, editor, Evolutionary Computation: Theory and Applications. Chapter 3, pp.81-123, Scientific Publ. Co., Singapore, 1999 (submitted 1996).

7.
J.  Schmidhuber, J.  Zhao, N. Schraudolph. Reinforcement learning with self-modifying policies. In S. Thrun and L. Pratt, eds., Learning to learn, Kluwer, pages 293-309, 1997.

6.
J.  Schmidhuber and J.  Zhao. Multiagent learning with the success-story algorithm. In G. Weiss, ed., Distributed Artificial Intelligence Meets Machine Learning, pages 82-93, Springer, Berlin, 1997.

5.
R. Salustowicz, M. Wiering, and J. Schmidhuber. On learning soccer strategies. In W. Gerstner, A. Germond, M. Hasler, J.-D. Nicoud, eds., Proceedings of the International Conference on Artificial Neural Networks, Lausanne, Switzerland, Springer, 769-774, 1997.

4.
J.  Zhao and J.  Schmidhuber. Incremental self-improvement for life-time multiagent reinforcement learning. In Pattie Maes, Maja Mataric, Jean-Arcady Meyer, Jordan Pollack, and Stewart W. Wilson, eds., From Animals to Animats 4: Proceedings of the Fourth International Conference on Simulation of Adaptive Behavior, pages 516-525, MIT Press, Bradford Books, Cambridge, MA, 1996.

3.
J.  Schmidhuber. Realistic multiagent reinforcement learning. In G. Weiss, ed., Learning in Distributed Artificial Intelligence Systems. Working Notes of the 1996 ECAI Workshop, 1996.

2.
J. Schmidhuber. A general method for multiagent learning in unrestricted environments. In 1996 AAAI Symposium on Adaptation, Co-evolution and Learning in Multiagent Systems, TR SS-96-01, pages 84-87, AAAI Press, Menlo Park, Calif., 1996.

1.
J.  Schmidhuber. Evolutionary principles in self-referential learning, or on learning how to learn: The meta-meta-... hook. Diploma thesis, Institut für Informatik, Technische Universität München, 1987.


LOW-COMPLEXITY ART AND BEAUTY

The concept of "Low-Complexity Art" was introduced in 1994. Low-complexity art is the computer age equivalent of minimal art. It is art with low Kolmogorov complexity - art that can be generated by a short algorithm. This is related to a simple theory of beauty, which essentially claims: among several patterns classified as "comparable" by some subjective observer, the subjectively most beautiful is the one with the simplest (shortest) description, given the observer's particular method for encoding and memorizing it.
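
The selection rule in this theory can be illustrated with a crude stand-in. Kolmogorov complexity is not computable, so the sketch below assumes a hypothetical observer whose encoding method is zlib compression, and it picks the pattern with the shortest compressed form. The function names and the choice of zlib are assumptions for illustration only.

```python
import random
import zlib


def description_length(pattern: bytes) -> int:
    """Length of the observer's encoding; zlib is an assumed stand-in encoder."""
    return len(zlib.compress(pattern, 9))


def most_beautiful(patterns):
    """Among comparable patterns, return the one with the shortest description."""
    return min(patterns, key=description_length)


regular = b"ab" * 50                                        # highly regular pattern
rng = random.Random(0)                                      # fixed seed for repeatability
irregular = bytes(rng.getrandbits(8) for _ in range(100))   # same length, no structure

choice = most_beautiful([irregular, regular])
```

A different observer (a different encoder) may rank the same patterns differently, which is the sense in which the theory calls beauty subjective.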

5.
J. Schmidhuber. Facial beauty and fractal geometry. Note IDSIA-28-98, IDSIA, June 1998 (1.29M, ca. 4.96 M gunzipped). HTML version (ca. 450K, including 5 color figures).

4.
J. Schmidhuber. Low-Complexity Art. Leonardo, Journal of the International Society for the Arts, Sciences, and Technology, 30(2):97-103, MIT Press, 1997. Print on high-resolution (600 dpi) printer, preferably double paged on A4 paper (172 K, uncompresses to 1.1 M). HTML version.

3.
J. Schmidhuber. Femmes Fractales. Report IDSIA-99-97, IDSIA, December 1997.

2.
J.  Schmidhuber. Algorithmisch einfache Kunst. Manuscript, 1994.

1.
J.  Schmidhuber. Low-Complexity Art. Report FKI-197-94, Fakultät für Informatik, Technische Universität München, 1994.

LIFE, UNIVERSE, AND EVERYTHING

In the beginning the Great Programmer wrote a program that computes all computable universes instead of just ours. This greatly simplified things.

1.
J.  Schmidhuber. A computer scientist's view of life, the universe, and everything. In C. Freksa, M. Jantzen, and R. Valk, eds., Foundations of Computer Science: Potential - Theory - Cognition, Lecture Notes in Computer Science, pages 201-208, Springer, 1997. HTML version.
