Juergen Schmidhuber's publications by topic
JÜRGEN SCHMIDHUBER'S PUBLICATIONS BY TOPIC (last update 1998)
Most of the files below are gzipped PostScript files.
Decompress them with "gunzip".
Alternatively choose
publications by type and date.
Note: some papers address several issues - that's
why they are repeatedly listed under different topics.
Please do not hesitate to contact juergen@idsia.ch in case this
page contains statements that you consider false or misleading.
LOW-COMPLEXITY NEURAL NETWORKS, GENERALIZATION, OCCAM'S RAZOR
Many of our machine learning algorithms, in one way or another,
discover and exploit initially unknown
environmental regularities. Regularity
implies algorithmic compressibility - inductive
learning and generalization are closely related to
data compression.
For instance, a "minimum description length"-based argument
shows that flat minima of typical neural network error functions
correspond to low expected overfitting/high generalization.
In applications to stock market prediction, flat minimum
search [2-3, 5] (with Sepp Hochreiter)
outperforms other widely used competitors.
In related but perhaps even more ambitious work a derivative
of Levin's universal search algorithm is used to discover
neural nets with low Levin complexity, low Kolmogorov complexity,
and high generalization capability [1,4]. At least with certain toy
problems where it is computationally feasible, the method can
lead to generalization results unmatchable by traditional neural net
algorithms.
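The link between flat minima and short descriptions can be illustrated with a
one-dimensional toy. This is only a sketch under stated assumptions: the two
quadratic loss functions, the tolerance eps, the prior range of 10.0, and the
helper names tolerance_width and bits_to_specify are all hypothetical, and the
code is not the flat minimum search algorithm of refs [2-3, 5]. It shows that a
weight sitting in a flat minimum can be stated with less precision, hence fewer
bits, without leaving the low-error region.

```python
import math


def tolerance_width(loss, w_star, eps, step=1e-3, max_steps=100000):
    """Width of the symmetric interval around w_star in which the loss
    stays within eps of loss(w_star), measured on a grid of size step."""
    base = loss(w_star)
    k = 0
    while (k < max_steps
           and loss(w_star + (k + 1) * step) - base <= eps
           and loss(w_star - (k + 1) * step) - base <= eps):
        k += 1
    return 2 * k * step


def bits_to_specify(width, prior_range=10.0):
    """Bits needed to name one interval of the given width inside a
    weight range of size prior_range (assumed uniform prior)."""
    return math.log2(prior_range / width)


# Two hypothetical minima with equal error but different curvature.
def sharp(w):
    return 50.0 * (w - 1.0) ** 2


def flat(w):
    return 0.5 * (w + 2.0) ** 2


width_sharp = tolerance_width(sharp, 1.0, eps=0.01)
width_flat = tolerance_width(flat, -2.0, eps=0.01)
bits_sharp = bits_to_specify(width_sharp)
bits_flat = bits_to_specify(width_flat)

print(f"sharp minimum: width {width_sharp:.3f}, {bits_sharp:.2f} bits")
print(f"flat minimum:  width {width_flat:.3f}, {bits_flat:.2f} bits")
```

In this toy the flat minimum tolerates roughly ten times more weight
imprecision, which is the sense in which it has the shorter description.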
- 5.
- S. Hochreiter and J. Schmidhuber.
Feature extraction through LOCOCODE.
Neural Computation 11(3): 679-714, 1999
(28 pages, 20 figures, 703 K, 4.9 M gunzipped).
- 4.
-
J. Schmidhuber.
Discovering neural nets with low Kolmogorov complexity
and high generalization capability.
Neural Networks, 10(5):857-873, 1997 (123 K).
- 3.
-
S. Hochreiter and J. Schmidhuber.
Flat Minima.
Neural Computation, 9(1):1-42, 1997 (201 K).
- 2.
- S. Hochreiter and J. Schmidhuber.
Simplifying neural nets by discovering flat minima.
In G. Tesauro, D. S. Touretzky and T. K. Leen, eds.,
Advances in Neural Information Processing Systems 7,
pages 529-536.
MIT Press, Cambridge MA, 1995.
- 1.
- J. Schmidhuber.
Discovering solutions with low Kolmogorov complexity
and high generalization capability.
In A. Prieditis and S. Russell, editors, Machine Learning:
Proceedings of the Twelfth International Conference, pages 488-496. Morgan
Kaufmann Publishers, San Francisco, CA, 1995.
SUPERVISED RECURRENT NEURAL NETWORKS
We introduced various novel learning algorithms
for recurrent neural nets with time-varying inputs.
The most remarkable achievement so far is
Long Short-Term Memory (with Sepp Hochreiter, 1995 -), an
algorithm that avoids many drawbacks of
previous approaches: LSTM can learn to bridge very long
time lags by enforcing constant error flow back through
time. In experimental comparisons with competing approaches,
LSTM leads to many more successful runs, and learns much faster.
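Why constant error flow matters can be seen from the factor that scales an
error signal sent back through a single self-connected unit. The following is
a minimal sketch, not the LSTM algorithm itself: backprop_factor is a
hypothetical helper, and the derivative values are idealized assumptions (the
logistic sigmoid's derivative is taken at its maximum of 0.25 at every step).

```python
def backprop_factor(weight, derivs):
    """Product of the per-step factors weight * f'(net_t) that scale an
    error signal propagated back through len(derivs) time steps."""
    factor = 1.0
    for d in derivs:
        factor *= weight * d
    return factor


T = 100  # assumed time lag

# Conventional logistic unit with self-weight 1.0: f'(net) <= 0.25,
# so even in the best case the error shrinks exponentially with T.
sigmoid_factor = backprop_factor(weight=1.0, derivs=[0.25] * T)

# Linear unit with fixed self-weight 1.0: f'(net) = 1 at every step,
# so the error is carried back unchanged.
linear_factor = backprop_factor(weight=1.0, derivs=[1.0] * T)

print(sigmoid_factor)
print(linear_factor)
```

The second case is the idealized situation that a constant-error memory cell
aims for; the gating that protects such a cell from irrelevant inputs is not
modeled here.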
Earlier work on "History Compression"
exploited regularities in partly predictable symbol
strings to accelerate sequence classification with
recurrent nets. An implementation based on hierarchical
recurrent nets easily outperforms previous approaches
when it comes to learning regular grammars from symbol
sequences with long time lags between occurrences of
relevant symbols (but there have to be local regularities).
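The principle of history compression can be sketched as follows: a predictor
guesses each next symbol, and only the unexpected symbols (with their time
steps) are passed on, so the next level sees a much shorter sequence. In this
sketch a simple successor table stands in for the recurrent predictor network;
that substitution, the example string, and the names fit_predictor, compress,
and decompress are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def fit_predictor(seq):
    """Map each symbol to its most frequent successor in seq."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(seq, seq[1:]):
        counts[prev][nxt] += 1
    return {prev: c.most_common(1)[0][0] for prev, c in counts.items()}


def compress(seq, predictor):
    """Keep (time step, symbol) pairs only where the prediction failed."""
    kept = [(0, seq[0])]  # the first symbol is never predictable
    for t in range(1, len(seq)):
        if predictor.get(seq[t - 1]) != seq[t]:
            kept.append((t, seq[t]))
    return kept


def decompress(kept, predictor, length):
    """Rebuild the sequence from the surprises plus the predictor."""
    surprises = dict(kept)
    out = []
    for t in range(length):
        out.append(surprises[t] if t in surprises else predictor[out[-1]])
    return "".join(out)


sequence = "abcabcabcxabcabc"
predictor = fit_predictor(sequence)
reduced = compress(sequence, predictor)
restored = decompress(reduced, predictor, len(sequence))

print(reduced)
print(restored == sequence)
```

Because the reduced description together with the predictor is enough to
rebuild the input, no information is lost, and events that were far apart in
the original sequence become neighbors in the reduced one.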
Several papers on other recurrent net aspects are included below.
- 16.
-
S. Hochreiter and J. Schmidhuber.
Long Short-Term Memory.
Neural Computation, 9(8):1735-1780, 1997 (170 K).
- 15.
- S. Hochreiter and J. Schmidhuber.
LSTM can solve hard long time lag problems.
In M. C. Mozer, M. I. Jordan, T. Petsche, eds.,
Advances in Neural Information Processing Systems 9,
pages 473-479, MIT Press, Cambridge MA, 1997.
- 14.
- S. Hochreiter and J. Schmidhuber.
Bridging long time lags by weight guessing and "Long Short-Term
Memory".
In F. L. Silva, J. C. Principe, L. B. Almeida, eds.,
Frontiers in Artificial Intelligence and Applications, Volume 37,
pages 65-72, IOS Press, Amsterdam, Netherlands, 1996.
- 13.
- J. Schmidhuber and S. Hochreiter.
Guessing can outperform many long time lag algorithms.
Technical Note IDSIA-19-96, IDSIA, May 1996.
- 12.
- J. Schmidhuber.
A self-referential weight matrix.
In Proceedings of the International Conference on Artificial
Neural Networks, Amsterdam, pages 446-451. Springer, 1993.
- 11.
- J. Schmidhuber.
On decreasing the ratio between learning complexity and number of
time-varying variables in fully recurrent nets.
In Proceedings of the International Conference on Artificial
Neural Networks, Amsterdam, pages 460-463. Springer, 1993.
- 10.
- J. Schmidhuber.
Netzwerkarchitekturen, Zielfunktionen und Kettenregel.
(Net architectures, objective functions, and chain rule.)
Habilitationsschrift (postdoctoral thesis),
Institut für Informatik, Technische Universität
München, 1993.
Part I (196 K),
Part II (148 K),
Part III (213 K).
- 9.
-
J. Schmidhuber.
Learning complex,
extended sequences using the principle of history compression.
Neural Computation, 4(2):234-242, 1992 (41 K).
- 8.
- J. Schmidhuber.
Learning unambiguous reduced sequence descriptions.
In J. E. Moody, S. J. Hanson, and R. P. Lippmann, editors,
Advances in Neural Information Processing Systems 4, pages 291-298. San
Mateo, CA: Morgan Kaufmann, 1992.
- 7.
-
J. Schmidhuber.
A fixed size
storage O(n^3) time complexity learning algorithm for fully recurrent
continually running networks.
Neural Computation, 4(2):243-248, 1992 (33 K).
- 6.
-
J. Schmidhuber.
Learning to
control fast-weight memories: An alternative to recurrent nets.
Neural Computation, 4(1):131-139, 1992 (39 K).
- 5.
- J. Schmidhuber.
Learning temporary variable binding with dynamic links.
In Proc. International Joint Conference on Neural Networks,
Singapore, volume 3, pages 2075-2079. IEEE, 1991.
- 2.
- J. Schmidhuber.
Learning algorithms for networks with internal and external feedback.
In D. S. Touretzky, J. L. Elman, T. J. Sejnowski,
and G. E. Hinton,
editors, Proc. of the 1990 Connectionist Models Summer School, pages
52-61. San Mateo, CA: Morgan Kaufmann, 1990.
- 1.
- J. Schmidhuber.
Dynamische neuronale Netze und das fundamentale raumzeitliche
Lernproblem. (Dynamic neural nets and the fundamental spatio-temporal
credit assignment problem.) Dissertation,
Institut für Informatik, Technische
Universität München, 1990.
UNSUPERVISED NEURAL NETWORKS, REDUNDANCY REDUCTION, ICA
To my knowledge,
"Predictability Minimization" (PM) [2,5,7,10,11,12,14,17] is
the first non-linear neural algorithm for encoding input data consisting
of non-linear mixtures of basic features by "factorial"
codes with statistically independent components
(ICA stands for "independent component analysis").
PM is a co-evolutionary, unsupervised learning algorithm
based on neural feature detectors
and predictors that fight each other (1991 - ).
PM has various potential advantages over other
neural methods for redundancy reduction.
When applied to image data, PM
automatically comes up with
feature detectors reminiscent of those in biological systems (such as
orientation sensitive edge detectors, on-center-off-surround detectors,
bar detectors).
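The quantity that predictors and feature detectors fight over can be sketched
on a toy code. This is an illustration under stated assumptions, not the PM
training procedure: instead of trained neural predictors, the hypothetical
function pm_objective uses the best possible predictor of each code unit given
the others (the conditional mean), and the two binary codes are made up.
Predictors try to make this error small; code units try to make it large.

```python
from collections import defaultdict
from itertools import product


def pm_objective(codes):
    """Mean squared error, summed over code units, of the optimal
    predictor of each unit given all other units of the same code."""
    n_units = len(codes[0])
    total = 0.0
    for i in range(n_units):
        groups = defaultdict(list)
        for c in codes:
            context = c[:i] + c[i + 1:]  # all units except unit i
            groups[context].append(c[i])
        for values in groups.values():
            mean = sum(values) / len(values)
            total += sum((v - mean) ** 2 for v in values)
    return total / len(codes)


# Two independent fair bits: a factorial code.
factorial_code = list(product([0, 1], repeat=2))
# Second unit copies the first: a fully redundant code.
redundant_code = [(0, 0), (1, 1), (0, 0), (1, 1)]

print(pm_objective(factorial_code))
print(pm_objective(redundant_code))
```

The redundant code is perfectly predictable (error zero), while the factorial
code leaves each unit as unpredictable as its own variance allows, which is
the state the code units are driven towards.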
An alternative method called
LOCOCODE (1995 - ) performs ICA as a by-product
of discovering simple networks (with low information-theoretic
complexity) coding the input data [15,16,19-22]. It can outperform
previous methods for ICA and PCA, and establishes
a link between regularization and unsupervised learning.
Automatic sequence compression methods [1,3,4,8] also can be
classified as unsupervised coding approaches.
The "Neural Heat Exchanger" [13] (presented in talks since 1990) is a
supervised variant of Hinton and Dayan's 1994 "Helmholtz machine".
- 22.
- S. Hochreiter and J. Schmidhuber.
Source separation as a by-product of regularization.
To be presented at NIPS'98, 1998.
- 21.
- S. Hochreiter and J. Schmidhuber.
LOCOCODE performs nonlinear ICA without knowing the
number of sources. To be presented at
ICA'99, January 11-15, 1999.
- 20.
- S. Hochreiter and J. Schmidhuber.
Feature extraction through LOCOCODE.
Neural Computation 11(3): 679-714, 1999
(28 pages, 20 figures, 703 K, 4.9 M gunzipped).
- 19.
- S. Hochreiter and J. Schmidhuber.
LOCOCODE versus PCA and ICA.
In Proceedings of the International Conference on
Artificial Neural Networks, Sweden,
Springer, to appear 1998.
- 18.
- J. Schmidhuber.
Neural predictors for detecting and removing redundant information.
In H. Cruse, J. Dean, and H. Ritter, editors, Adaptive Behavior
and Learning. Kluwer, 1998, in preparation.
- 17.
- M. Eldracher, N. N. Schraudolph, and J. Schmidhuber,
Processing
Images by Semi-Linear Predictability
Minimization.
Technical Report IDSIA-77-97, 1997.
- 16.
- S. Hochreiter and J. Schmidhuber.
Low-complexity coding and decoding. In
K. M. Wong, I. King, D. Yeung, eds.,
Theoretical Aspects of Neural Computation: a Multidisciplinary Perspective,
pages 297-306, Springer, 1997.
- 15.
- S. Hochreiter and J. Schmidhuber.
Unsupervised coding with LOCOCODE.
In W. Gerstner, A. Germond, M. Hasler, J.-D. Nicoud, eds.,
Proceedings of the International Conference on
Artificial Neural Networks, Lausanne, Switzerland,
Springer, 655-660, 1997.
- 14.
-
J. Schmidhuber and M. Eldracher and B. Foltin.
Semilinear predictability minimization produces well-known
feature detectors.
Neural Computation, 8(4):773-786, 1996 (260 K).
- 13.
- J. Schmidhuber.
The Neural Heat Exchanger.
In S. Amari, L. Xu, L. Chan, I. King, K. Leung, eds.,
Progress in Neural Information
Processing: Proceedings of the Intl. Conference
on Neural Information Processing, pages 194-197,
Springer, Hong Kong, 1996. Earlier presentations
in talks at universities since 1990.
- 12.
- J. Schmidhuber and B. Foltin.
Semilinear predictability minimization produces orientation
sensitive edge detectors.
Technical Report FKI-201-94, Fakultät für Informatik,
Technische Universität München, December 1994.
- 11.
-
J. Schmidhuber and D. Prelinger.
Discovering
predictable classifications.
Neural Computation, 5(4):625-635, 1993 (51 K).
- 10.
- J. Schmidhuber and D. Prelinger.
Unsupervised extraction of predictable abstract features.
In Proceedings of the International Conference on Artificial
Neural Networks, Amsterdam, pages 601-604. Springer, 1993.
- 9.
- J. Schmidhuber and D. Prelinger.
A novel unsupervised classification method.
In Proc. of the Intl. Conf. on Artificial Neural Networks,
Brighton, pages 91-96. IEE, 1993.
- 8.
- J. Schmidhuber, M. C. Mozer, and D. Prelinger.
Continuous history compression.
In H. Hüning, S. Neuhauser,
M. Raus, and W. Ritschel, editors,
Proc. of Intl. Workshop on Neural Networks, RWTH Aachen, pages 87-95.
Augustinus, 1993.
- 7.
-
J. Schmidhuber.
Learning factorial
codes by predictability minimization.
Neural Computation, 4(6):863-879, 1992 (53 K).
- 6.
-
J. Schmidhuber.
Learning complex,
extended sequences using the principle of history compression.
Neural Computation, 4(2):234-242, 1992 (41 K).
- 5.
- J. Schmidhuber and D. Prelinger.
Discovering predictable classifications.
Technical Report CU-CS-626-92, Dept. of Comp. Sci., University of
Colorado at Boulder, November 1992.
- 4.
- J. Schmidhuber.
Learning unambiguous reduced sequence descriptions.
In J. E. Moody, S. J. Hanson, and R. P. Lippmann, editors,
Advances in Neural Information Processing Systems 4, pages 291-298. San
Mateo, CA: Morgan Kaufmann, 1992.
- 3.
- J. Schmidhuber.
Adaptive decomposition of time.
In T. Kohonen, K. Mäkisara,
O. Simula, and J. Kangas, editors,
Artificial Neural Networks, pages 909-914. Elsevier Science Publishers
B.V., North-Holland, 1991.
- 2.
- J. Schmidhuber.
Learning factorial codes by predictability minimization.
Technical Report CU-CS-565-91, Dept. of Comp. Sci., University of
Colorado at Boulder, December 1991.
- 1.
- J. Schmidhuber.
Neural sequence chunkers.
Technical Report FKI-148-91, Institut für Informatik, Technische
Universität München, April 1991.
SEQUENCE COMPRESSION
An important special case of redundancy reduction
(compare section on unsupervised learning and ICA).
- 8.
- J. Schmidhuber and S. Heil.
Compressing texts with neural nets. In
Dale, Moisl and Somers, eds.,
Handbook of Natural Language Processing,
Marcel Dekker, Inc.,
to appear 1998.
- 7.
-
J. Schmidhuber and S. Heil.
Sequential neural text compression.
IEEE Transactions on Neural Networks,
7(1):142-146, 1996 (68 K).
- 6.
- J. Schmidhuber and S. Heil.
Predictive coding with neural nets: Application to text compression.
In G. Tesauro, D. S. Touretzky and T. K. Leen, eds.,
Advances in Neural Information Processing Systems 7, pages 1047-1054.
MIT Press, Cambridge MA, 1995.
- 5.
- J. Schmidhuber, M. C. Mozer, and D. Prelinger.
Continuous history compression.
In H. Hüning, S.
Neuhauser, M. Raus, and W. Ritschel, editors,
Proc. of Intl. Workshop on Neural Networks, RWTH Aachen, pages 87-95.
Augustinus, 1993.
- 4.
-
J. Schmidhuber.
Learning complex,
extended sequences using the principle of history compression.
Neural Computation, 4(2):234-242, 1992 (41 K).
- 3.
- J. Schmidhuber.
Learning unambiguous reduced sequence descriptions.
In J. E. Moody, S. J. Hanson, and R. P. Lippmann, editors,
Advances in Neural Information Processing Systems 4, pages 291-298. San
Mateo, CA: Morgan Kaufmann, 1992.
- 2.
- J. Schmidhuber.
Adaptive history compression for learning to divide and conquer.
In Proc. International Joint Conference on Neural Networks,
Singapore, volume 2, pages 1130-1135. IEEE, 1991.
- 1.
- J. Schmidhuber.
Adaptive decomposition of time.
In T. Kohonen, K. Mäkisara,
O. Simula, and J. Kangas, editors,
Artificial Neural Networks, pages 909-914. Elsevier Science Publishers
B.V., North-Holland, 1991.
- 0.
- J. Schmidhuber.
Neural sequence chunkers.
Technical Report FKI-148-91, Institut für Informatik, Technische
Universität München, April 1991.
STOCK MARKET PREDICTION
Our most lucrative application.
- 2.
-
S. Hochreiter and J. Schmidhuber.
Flat Minima.
Neural Computation, 9(1):1-42, 1997 (201 K).
- 1.
- S. Hochreiter and J. Schmidhuber.
Simplifying neural nets by discovering flat minima.
In G. Tesauro, D. S. Touretzky and T. K. Leen, eds.,
Advances in Neural Information Processing Systems 7,
pages 529-536.
MIT Press, Cambridge MA, 1995.
REINFORCEMENT LEARNING IN FULLY OBSERVABLE WORLDS
Most work in mainstream reinforcement learning assumes that the
learner's current input tells it everything about
the environmental state. This is often unrealistic but
makes things much easier.
- 4.
-
M. Wiering and J. Schmidhuber.
Fast online Q(lambda).
Machine Learning, accepted 1998 (80 K).
- 3.
- M. Wiering and J. Schmidhuber.
Efficient model-based exploration.
In From Animals to Animats 5: Proceedings
of the Fifth International Conference on Simulation of Adaptive
Behavior, 1998, in press.
- 2.
- J. Storck, S. Hochreiter, and J. Schmidhuber.
Reinforcement-driven information acquisition in non-deterministic
environments.
In Proc. ICANN'95, vol. 2, pages 159-164.
EC2 & CIE, Paris, 1995.
- 1.
- J. Schmidhuber.
Curious model-building control systems.
In Proc. International Joint Conference on Neural Networks,
Singapore, volume 2, pages 1458-1463. IEEE, 1991.
REINFORCEMENT LEARNING IN PARTIALLY OBSERVABLE WORLDS
Many of our learning agents have an internal state that they
can use to memorize important events. The question
is: how can they learn to identify and store
those events relevant for further optimal action
selection? To address this issue we have studied reinforcement
learners with
(a) recurrent neural net value function approximators (1990 -),
(b) recurrent neural net world models (1990 -),
(c) actions that address and set internal storage cells, trained by
the success-story algorithm (1994 -),
(d) direct search in a space of event-memorizing algorithms,
(e) other things.
- 21.
-
M. Wiering and J. Schmidhuber.
HQ-Learning.
Adaptive Behavior 6(2):219-246, 1998 (122 K).
- 20.
- R. Salustowicz and M. Wiering and J. Schmidhuber.
Learning team strategies: soccer case studies.
Machine Learning, to appear 1998 (127 K).
- 19.
-
J. Schmidhuber, J. Zhao, and M. Wiering.
Shifting inductive bias with success-story algorithm,
adaptive Levin search, and incremental self-improvement.
Machine Learning 28:105-130, 1997.
- 18.
- J. Schmidhuber, J. Zhao, N. Schraudolph.
Reinforcement learning with self-modifying policies.
In S. Thrun and L. Pratt, eds.,
Learning to learn, Kluwer, pages 293-309, 1997.
- 17.
-
R. Salustowicz and J. Schmidhuber.
Probabilistic incremental program evolution.
Evolutionary Computation, 5(2):123-141, 1997.
- 16.
- M. Wiering and J. Schmidhuber.
Solving POMDPs using Levin search and EIRA.
In L. Saitta, ed.,
Machine Learning:
Proceedings of the 13th International Conference,
pages 534-542,
Morgan Kaufmann Publishers, San Francisco, CA, 1996.
- 15.
- M. Wiering and J. Schmidhuber.
HQ-Learning: Discovering Markovian subgoals for non-Markovian
reinforcement learning.
Technical Report IDSIA-95-96, IDSIA, October 1996.
- 14.
- J. Schmidhuber and J. Zhao and M. Wiering.
Simple principles of metalearning.
Technical Report IDSIA-69-96, IDSIA, June 1996.
- 13.
- J. Schmidhuber.
On learning how to learn learning strategies.
Technical Report FKI-198-94, Fakultät für Informatik,
Technische Universität München, November 1994.
- 12.
- J. Schmidhuber.
Reinforcement learning in Markovian and non-Markovian environments.
In R. P. Lippmann, J. E. Moody, and D. S. Touretzky, editors,
Advances in Neural Information Processing Systems 3, pages 500-506. San
Mateo, CA: Morgan Kaufmann, 1991.
- 11.
-
J. Schmidhuber and R. Huber.
Learning to
generate artificial fovea trajectories for target detection.
International Journal of Neural Systems, 2(1 & 2):135-141, 1991
(50 K - figures omitted!).
- 10.
- J. Schmidhuber and R. Huber.
Using sequential adaptive neuro-control for efficient learning of
rotation and translation invariance.
In T. Kohonen, K. Mäkisara,
O. Simula, and J. Kangas, editors,
Artificial Neural Networks, pages 315-320. Elsevier Science Publishers
B.V., North-Holland, 1991.
- 9.
- J. Schmidhuber.
Learning algorithms for networks with internal and external feedback.
In D. S. Touretzky, J. L. Elman,
T. J. Sejnowski, and G. E. Hinton,
editors, Proc. of the 1990 Connectionist Models Summer School, pages
52-61. San Mateo, CA: Morgan Kaufmann, 1990.
- 8.
- J. Schmidhuber.
An on-line algorithm for dynamic reinforcement learning and planning
in reactive environments.
In Proc. IEEE/INNS International Joint Conference on Neural
Networks, San Diego, volume 2, pages 253-258, 1990.
- 7.
- J. Schmidhuber.
Reinforcement learning with interacting continually running fully
recurrent networks.
In Proc. INNC International Neural Network Conference, Paris,
volume 2, pages 817-820, 1990.
- 6.
- J. Schmidhuber.
Temporal-difference-driven learning in recurrent networks.
In R. Eckmiller, G. Hartmann, and G. Hauske, editors, Parallel
Processing in Neural Systems and Computers, pages 209-212. North-Holland,
1990.
- 5.
- J. Schmidhuber.
Reinforcement-Lernen und adaptive Steuerung.
(Reinforcement learning and adaptive control.)
Nachrichten Neuronale Netze, 2:1-3, 1990.
- 4.
- J. Schmidhuber.
Making the world differentiable: On using fully recurrent
self-supervised neural networks for dynamic reinforcement learning and
planning in non-stationary environments.
Technical Report FKI-126-90, Institut für Informatik,
Technische Universität München, February 1990 (revised in November).
- 3.
- J. Schmidhuber.
Networks adjusting networks.
In J. Kindermann and A. Linden, editors, Proceedings of
`Distributed Adaptive Neural Information Processing', St. Augustin,
24-25 May 1989, pages 197-208. Oldenbourg, 1990.
Extended version: TR FKI-125-90 (revised),
Institut für Informatik, TUM.
- 2.
- J. Schmidhuber.
Dynamische neuronale Netze und das fundamentale raumzeitliche
Lernproblem. (Dynamic neural nets and the fundamental spatio-temporal
credit assignment problem.) Dissertation,
Institut für Informatik, Technische
Universität München, 1990.
- 1.
-
J. Schmidhuber.
A local learning algorithm for dynamic feedforward and
recurrent networks.
Connection Science, 1(4):403-412, 1989.
(The Neural Bucket Brigade, 43 K - figures omitted!).
EXPLORATION - WHAT'S INTERESTING?
We introduced the first active reinforcement learning methods
translating mismatches between expectations and reality
into reinforcement for "curious", exploring
agents (1991 - ): our agents like to go
where they expect to learn something.
One recent focus is on exploring
the space of general algorithms (as opposed to
traditional simple grid-worlds).
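Turning a mismatch between expectation and reality into reinforcement can be
sketched with a tabular world model. This is a minimal illustration, not one
of the published systems: the class TableWorldModel, its learning rate, the
state and action labels, and the choice of squared prediction error as the
reward signal are all assumptions.

```python
class TableWorldModel:
    """Predicts a scalar outcome for each (state, action) pair."""

    def __init__(self, learning_rate=0.5):
        self.prediction = {}
        self.learning_rate = learning_rate

    def curiosity_reward(self, state, action, outcome):
        """Return the squared mismatch between expected and observed
        outcome, then move the prediction towards the observation."""
        key = (state, action)
        expected = self.prediction.get(key, 0.0)
        mismatch = outcome - expected
        self.prediction[key] = expected + self.learning_rate * mismatch
        return mismatch ** 2


model = TableWorldModel()
# The agent repeats the same hypothetical experience five times.
rewards = [model.curiosity_reward("s0", "right", 1.0) for _ in range(5)]
print(rewards)
```

The intrinsic reward shrinks as the model improves, so a reward-maximizing
agent loses interest in what it can already predict and moves on to places
where it still expects to learn something.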
- 10.
- J. Schmidhuber.
What's interesting?
In Abstract Collection of SNOWBIRD:
Machines That Learn.
Utah, April 1998.
- 9.
- M. Wiering and J. Schmidhuber.
Efficient model-based exploration.
In From Animals to Animats 5: Proceedings
of the Fifth International Conference on Simulation of Adaptive
Behavior, 1998, in press.
- 8.
- M. Wiering and J. Schmidhuber.
Learning exploration policies with models.
To appear in Proc. CONALD, 1998.
- 7.
- J. Schmidhuber.
What's interesting?
Technical Report IDSIA-35-97, IDSIA, July 1997
(23 pages, 10 figures, 157 K, 834 K gunzipped).
- 6.
- J. Schmidhuber, J. Zhao, N. Schraudolph.
Reinforcement learning with self-modifying policies.
In S. Thrun and L. Pratt, eds.,
Learning to learn, Kluwer, pages 293-309, 1997.
- 5.
- J. Storck, S. Hochreiter, and J. Schmidhuber.
Reinforcement-driven information acquisition in non-deterministic
environments.
In Proc. ICANN'95, vol. 2, pages 159-164.
EC2 & CIE, Paris, 1995.
- 4.
- J. Schmidhuber.
On learning how to learn learning strategies.
Technical Report FKI-198-94, Fakultät für Informatik,
Technische Universität München, November 1994.
- 3.
- J. Schmidhuber.
Curious model-building control systems.
In Proc. International Joint Conference on Neural Networks,
Singapore, volume 2, pages 1458-1463. IEEE, 1991.
- 2.
- J. Schmidhuber.
Adaptive curiosity and adaptive confidence.
Technical Report FKI-149-91, Institut für Informatik, Technische
Universität München, April 1991.
- 1.
- J. Schmidhuber.
A possibility for implementing curiosity and boredom in
model-building neural controllers.
In J. A. Meyer and S. W. Wilson, editors, Proc. of the
International Conference on Simulation of Adaptive Behavior: From Animals to
Animats, pages 222-227. MIT Press/Bradford Books, 1991.
SUBGOAL DISCOVERY
There is no teacher providing useful intermediate
subgoals for our reinforcement learning systems. Refs [1-4]
use gradient-based subgoal generators, refs [5-6]
search in discrete subgoal space.
- 6.
-
M. Wiering and J. Schmidhuber.
HQ-Learning.
Adaptive Behavior 6(2):219-246, 1998 (122 K).
- 5.
- M. Wiering and J. Schmidhuber.
HQ-Learning: Discovering Markovian subgoals for non-Markovian
reinforcement learning.
Technical Report IDSIA-95-96, IDSIA, October 1996.
- 4.
- J. Schmidhuber.
Netzwerkarchitekturen, Zielfunktionen und Kettenregel.
(Net architectures, objective functions, and chain rule.)
Habilitationsschrift (postdoctoral thesis),
Institut für Informatik, Technische Universität
München, 1993.
Part I (196 K),
Part II (148 K),
Part III (213 K).
- 3.
- J. Schmidhuber and R. Wahnsiedler.
Planning simple trajectories using neural subgoal generators.
In J. A. Meyer, H. L. Roitblat, and S. W. Wilson, editors, Proc.
of the 2nd International Conference on Simulation of Adaptive Behavior,
pages 196-202. MIT Press, 1992.
- 2.
- J. Schmidhuber.
Learning to generate sub-goals for action sequences.
In T. Kohonen, K. Mäkisara,
O. Simula, and J. Kangas, editors,
Artificial Neural Networks, pages 967-972. Elsevier Science Publishers
B.V., North-Holland, 1991.
- 1.
- J. Schmidhuber.
Towards compositional learning with dynamic neural networks.
Technical Report FKI-129-90, Institut für Informatik, Technische
Universität München, 1990.
GENETIC PROGRAMMING
Genetic Programming (GP) uses Genetic Algorithms
to evolve computer programs. GP was invented
by Nichael Cramer in 1985. We were not aware of
his work when in 1987 we described the first (to our knowledge)
"modern" GP approach directly
operating on variable-length code [1].
We applied our system to simple tasks including
the "lawnmower problem", later also studied
by Koza (1994), whose home page and books
on GP unfortunately leave the impression
that he was the first to invent GP.
Pages 7-13 of ref [2]
are devoted to an extended GP approach that
recursively applies metalevel GP to the task of finding
better program-modifying programs
on lower levels - the goal is to use GP for improving GP.
- 2.
- J. Schmidhuber.
Evolutionary principles in self-referential learning, or on learning
how to learn: The meta-meta-... hook. Diploma thesis,
Institut für Informatik, Technische Universität München, 1987.
- 1.
- D. Dickmanns, J. Schmidhuber, and A. Winklhofer.
Der genetische Algorithmus: Eine Implementierung in Prolog.
(The genetic algorithm: An implementation in Prolog.)
Fortgeschrittenenpraktikum, Institut für Informatik, Lehrstuhl
Prof. Radig, Technische Universität München, 1987.
PROGRAM EVOLUTION
There may be much better ways of evolving computer programs than GP.
Our contributions include Adaptive Levin Search (extending
Levin's universal search algorithm, which is theoretically optimal for
non-incremental search), and
Probabilistic Incremental Program Evolution (PIPE).
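The time allocation behind Levin's universal search can be sketched on a toy
problem: in phase i, every program p of length l(p) <= i may run for at most
2^(i - l(p)) steps, so short programs receive exponentially more time. This is
plain, non-adaptive Levin search under stated assumptions; the two-instruction
language ("0" adds one, "1" doubles), the start value 1, and the names run and
levin_search are hypothetical and chosen only to keep the example small.

```python
from itertools import product

# Hypothetical instruction set: one bit per instruction.
OPS = {"0": lambda x: x + 1, "1": lambda x: x * 2}


def run(program, budget, start=1):
    """Execute one instruction per step; None if the budget runs out."""
    x = start
    for step, bit in enumerate(program, start=1):
        if step > budget:
            return None
        x = OPS[bit](x)
    return x


def levin_search(target, max_phase=20):
    """Return (program, phase) for the first program found that turns
    the start value into target, or None if max_phase is exhausted."""
    for phase in range(1, max_phase + 1):
        for length in range(1, phase + 1):
            budget = 2 ** (phase - length)  # time share ~ 2^(-length)
            for bits in product("01", repeat=length):
                program = "".join(bits)
                if run(program, budget) == target:
                    return program, phase
    return None


program, phase = levin_search(10)
print(program, phase)
```

The adaptive variant mentioned above additionally changes the probabilities of
instructions based on experience; that part is not modeled in this sketch.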
- 10.
- R. Salustowicz and J. Schmidhuber.
Learning to predict through PIPE and automatic task decomposition.
Technical Report IDSIA-11-98, IDSIA, April 1998.
- 9.
- R. Salustowicz and M. Wiering and J. Schmidhuber.
Learning team strategies: soccer case studies.
Machine Learning, to appear 1998 (127 K).
- 8.
- R. Salustowicz and J. Schmidhuber.
Evolving structured programs with hierarchical
instructions and skip nodes.
Machine Learning:
Proceedings of the 15th International Conference,
Morgan Kaufmann Publishers, San Francisco, CA,
to appear 1998.
- 7.
-
J. Schmidhuber, J. Zhao, and M. Wiering.
Shifting inductive bias with success-story algorithm,
adaptive Levin search, and incremental self-improvement.
Machine Learning 28:105-130, 1997.
- 6.
-
R. Salustowicz and J. Schmidhuber.
Probabilistic incremental program evolution.
Evolutionary Computation, 5(2):123-141, 1997.
- 5.
-
J. Schmidhuber.
Discovering neural nets with low Kolmogorov complexity
and high generalization capability.
Neural Networks, 10(5):857-873, 1997 (123 K).
- 4.
- R. Salustowicz and J. Schmidhuber.
Probabilistic incremental program evolution:
stochastic search through program space. In van Someren, M.,
Widmer, G., editors, Machine Learning: ECML-97,
Lecture Notes in Artificial Intelligence 1224,
pages 213-220, Springer, 1997.
- 3.
- M. Wiering and J. Schmidhuber.
Solving POMDPs using Levin search and EIRA.
In L. Saitta, ed.,
Machine Learning:
Proceedings of the 13th International Conference,
pages 534-542,
Morgan Kaufmann Publishers, San Francisco, CA, 1996.
- 2.
- J. Schmidhuber.
Evolutionary principles in self-referential learning, or on learning
how to learn: The meta-meta-... hook. Diploma thesis,
Institut für Informatik, Technische Universität München, 1987.
- 1.
- D. Dickmanns, J. Schmidhuber, and A. Winklhofer.
Der genetische Algorithmus: Eine Implementierung in Prolog.
(The genetic algorithm: An implementation in Prolog.)
Fortgeschrittenenpraktikum, Institut für Informatik, Lehrstuhl
Prof. Radig, Technische Universität München, 1987.
METALEARNING (LEARNING TO LEARN)
Metalearning means learning the credit
assignment method itself.
Metalearning may be
the most ambitious but also the most
rewarding goal of machine learning.
There are few limits to what
a good metalearner will learn.
Where appropriate it will learn to
learn by analogy, by chunking, by planning,
by subgoal generation, by combinations
thereof - you name it.
Don't be misled by confusion in some of the recent literature:
"learning to learn" is orthogonal to "inductive transfer" (learning
from successive tasks).
Most of our approaches to metalearning are based
on self-modifying policies (SMPs).
The learning algorithm of an SMP
is part of the SMP itself - SMPs can modify the way
they modify themselves.
We have introduced several ways of forcing SMPs to come
up with better and better self-modification algorithms:
(a) The "success-story algorithm" [6-14],
(b) Gradient calculation in recurrent nets [2-5],
(c) Market models of the mind inspired by Holland's bucket brigade [1],
(d) Genetic Programming on recursive meta-levels [1].
- 14.
- J. Schmidhuber.
A general method for incremental self-improvement
and multiagent learning.
In X. Yao, editor, Evolutionary Computation: Theory and Applications.
Chapter 3, pp. 81-123, Scientific Publ. Co., Singapore,
1999 (submitted 1996).
- 13.
- J. Schmidhuber, J. Zhao, N. Schraudolph.
Reinforcement learning with self-modifying policies.
In S. Thrun and L. Pratt, eds.,
Learning to learn, Kluwer, pages 293-309, 1997.
- 12.
-
J. Schmidhuber, J. Zhao, and M. Wiering.
Shifting inductive bias with success-story algorithm,
adaptive Levin search, and incremental self-improvement.
Machine Learning 28:105-130, 1997.
- 11.
- J. Zhao and J. Schmidhuber.
Solving a complex prisoner's dilemma
with self-modifying policies.
In From Animals to Animats 5: Proceedings
of the Fifth International Conference on Simulation of Adaptive
Behavior, 1998, in press.
- 10.
- J. Schmidhuber and J. Zhao and M. Wiering.
Simple principles of metalearning.
Technical Report IDSIA-69-96, IDSIA, June 1996.
- 9.
- M. Wiering and J. Schmidhuber.
Solving POMDPs using Levin search and EIRA.
In L. Saitta, ed.,
Machine Learning:
Proceedings of the 13th International Conference,
pages 534-542,
Morgan Kaufmann Publishers, San Francisco, CA, 1996.
- 8.
- J. Schmidhuber.
Environment-independent reinforcement acceleration
(invited talk at Hong Kong University of Science and Technology).
Technical Note IDSIA-59-95, June 1995.
- 7.
- J. Schmidhuber.
Beyond "Genetic Programming": Incremental Self-Improvement.
In J. Rosca, ed., Proc. Workshop on Genetic Programming at ML95,
pages 42-49. National Resource Lab for the study of Brain and Behavior,
1995.
- 6.
- J. Schmidhuber.
On learning how to learn learning strategies.
Technical Report FKI-198-94, Fakultät für Informatik,
Technische Universität München, November 1994.
- 5.
- J. Schmidhuber.
A neural network that embeds its own meta-levels.
In Proc. of the International Conference on Neural Networks '93,
San Francisco. IEEE, 1993.
- 4.
- J. Schmidhuber.
An introspective network that can learn to run its own weight change
algorithm.
In Proc. of the Intl. Conf. on Artificial Neural Networks,
Brighton, pages 191-195. IEE, 1993.
- 3.
- J. Schmidhuber.
A self-referential weight matrix.
In Proceedings of the International Conference on Artificial
Neural Networks, Amsterdam, pages 446-451. Springer, 1993.
- 2.
- J. Schmidhuber.
Steps towards `self-referential' learning.
Technical Report CU-CS-627-92, Dept. of Comp. Sci., University of
Colorado at Boulder, November 1992.
- 1.
- J. Schmidhuber.
Evolutionary principles in self-referential learning, or on learning
how to learn: The meta-meta-... hook. Diploma thesis,
Institut für Informatik, Technische Universität München, 1987.
MARKET MODELS FOR MACHINE LEARNING
Pages 23-51 of ref [1] are devoted to a reinforcement learning
approach called "prototypical self-referential learning
mechanisms" (PSALM 1 - PSALM 3).
PSALMs use competing "metalearning" agents
with actions for generating and connecting agents and for assigning credit to
agents, subject to the constraint that total credit is conserved
(except for external reward and consumption).
Ref [3] describes a related but less general
"economy" of neurons inspired by Holland's
bucket brigade. External reward
pays incoming weights of currently active output units, active unit U's
outgoing weights to other active units pay to U's incoming weights (money
= weight substance). Competition stems from partitioning the set of units
into winner-take-all subsets.
- 3.
-
J. Schmidhuber.
A local learning algorithm for dynamic feedforward and
recurrent networks.
Connection Science, 1(4):403-412, 1989.
(The Neural Bucket Brigade, 43 K - figures omitted!).
- 2.
- J. Schmidhuber.
The neural bucket brigade.
In R. Pfeifer, Z. Schreter, Z. Fogelman,
and L. Steels, editors,
Connectionism in Perspective, pages 439-446. Amsterdam: Elsevier,
North-Holland, 1989.
- 1.
- J. Schmidhuber.
Evolutionary principles in self-referential learning, or on learning
how to learn: The meta-meta-... hook. Diploma thesis,
Institut für Informatik, Technische Universität München, 1987.
PROBABILISTIC PROGRAMMING LANGUAGES, SUCCESS-STORY ALGORITHM
My main focus has always been on letting the credit assignment
strategy improve itself (metalearning - see above).
To this end I have used
"self-referential" probabilistic programs whose instructions may modify
the underlying probability distribution. A backtracking scheme called
the success-story algorithm occasionally undoes "bad" self-generated
probability modifications and stabilizes "good" ones to make sure the
system accelerates reward intake in the long run.
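The backtracking idea can be sketched as a stack of undoable policy modifications. This is a minimal illustration under assumptions, not the implementation from the papers below: the class and method names are hypothetical, the policy is a plain dictionary, every modification gets its own stack entry, and the check assumes that time has advanced since the most recent modification. At each checkpoint, a modification survives only if the reward per time unit since it was made exceeds the reward per time unit since the previous surviving modification (or since the start).

```python
from dataclasses import dataclass

@dataclass
class Entry:
    time: float       # when the modification was made
    reward: float     # cumulative reward at that moment
    key: str          # which policy component was changed
    old_value: float  # value to restore if the change is undone

class SSALearner:
    def __init__(self, policy):
        self.policy = dict(policy)
        self.stack = []
        self.time = 0.0
        self.reward = 0.0

    def advance(self, dt, r):
        """Let dt time units pass while collecting reward r."""
        self.time += dt
        self.reward += r

    def modify(self, key, new_value):
        """Change the policy and remember how to undo the change."""
        self.stack.append(Entry(self.time, self.reward, key, self.policy[key]))
        self.policy[key] = new_value

    def _rate_since(self, t0, r0):
        return (self.reward - r0) / (self.time - t0)

    def checkpoint(self):
        """Undo the most recent modifications until reward rates increase up the stack."""
        while self.stack:
            top = self.stack[-1]
            if self.time <= top.time:
                break  # no time has passed yet, so the change cannot be judged
            if len(self.stack) > 1:
                below = self.stack[-2]
                baseline = self._rate_since(below.time, below.reward)
            else:
                baseline = self._rate_since(0.0, 0.0)
            if self._rate_since(top.time, top.reward) > baseline:
                break  # the newest change pays off: keep it and everything below
            self.policy[top.key] = top.old_value
            self.stack.pop()

learner = SSALearner({"p": 0.5})
learner.advance(10, 10)      # reward rate 1.0 so far
learner.modify("p", 0.8)
learner.advance(10, 20)      # rate since this change: 2.0
learner.checkpoint()
kept_first = learner.policy["p"]

learner.modify("p", 0.9)
learner.advance(10, 5)       # rate since this change: 0.5
learner.checkpoint()         # the second change is undone, the first survives
print(kept_first, learner.policy["p"], len(learner.stack))
```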
- 8.
- J. Schmidhuber and J. Zhao.
Direct policy search and uncertain policy evaluation.
1999 AAAI Spring Symposium on Search
under Uncertain and Incomplete Information,
Stanford Univ., 1999.
- 7.
-
J. Schmidhuber, J. Zhao, and M. Wiering.
Shifting inductive bias with success-story algorithm,
adaptive Levin search, and incremental self-improvement.
Machine Learning 28:105-130, 1997.
- 6.
- J. Schmidhuber.
A general method for incremental self-improvement
and multiagent learning.
In X. Yao, editor, Evolutionary Computation: Theory and Applications.
Chapter 3, pp.81-123, Scientific Publ. Co., Singapore,
1999 (submitted 1996).
- 5.
- J. Schmidhuber, J. Zhao, N. Schraudolph.
Reinforcement learning with self-modifying policies.
In S. Thrun and L. Pratt, eds.,
Learning to learn, Kluwer, pages 293-309, 1997.
- 4.
- J. Schmidhuber and J. Zhao.
Multiagent learning with the success-story algorithm.
In G. Weiss, ed.,
Distributed Artificial Intelligence
Meets Machine Learning, pages 82-93,
Springer, Berlin, 1997.
- 3.
- J. Zhao and J. Schmidhuber.
Incremental self-improvement for
life-time multiagent reinforcement learning.
In Pattie Maes, Maja Mataric, Jean-Arcady Meyer, Jordan Pollack,
and Stewart W. Wilson, eds.,
From Animals to Animats 4: Proceedings
of the Fourth International Conference on Simulation of Adaptive
Behavior, pages 516-525, MIT Press, Bradford Books, Cambridge, MA, 1996.
- 2.
- J. Schmidhuber, J. Zhao, and M. Wiering.
Simple principles of metalearning.
Technical Report IDSIA-69-96, IDSIA, June 1996.
- 1.
- J. Schmidhuber.
On learning how to learn learning strategies.
Technical Report FKI-198-94, Fakultät für Informatik,
Technische Universität München, November 1994.
MULTIAGENT LEARNING
Some of the approaches mentioned above are naturally
applicable to multiagent learning.
- 12.
- R. Salustowicz, M. Wiering, and J. Schmidhuber.
Learning team strategies: soccer case studies.
Machine Learning, to appear 1998 (127 K).
- 11.
-
M. Wiering and J. Schmidhuber.
HQ-Learning.
Adaptive Behavior 6(2):219-246, 1998 (122 K).
- 10.
- M. Wiering and J. Schmidhuber.
CMAC Models Learn to Play Soccer.
In Proceedings of the International Conference on
Artificial Neural Networks, Sweden,
Springer, to appear 1998.
- 9.
- J. Zhao and J. Schmidhuber.
Solving a complex prisoner's dilemma
with self-modifying policies.
In From Animals to Animats 5: Proceedings
of the Fifth International Conference on Simulation of Adaptive
Behavior, 1998, in press.
- 8.
- J. Schmidhuber.
A general method for incremental self-improvement
and multiagent learning in unrestricted environments.
In X. Yao, editor, Evolutionary Computation: Theory and Applications.
Chapter 3, pp.81-123, Scientific Publ. Co., Singapore,
1999 (submitted 1996).
- 7.
- J. Schmidhuber, J. Zhao, N. Schraudolph.
Reinforcement learning with self-modifying policies.
In S. Thrun and L. Pratt, eds.,
Learning to learn, Kluwer, pages 293-309, 1997.
- 6.
- J. Schmidhuber and J. Zhao.
Multiagent learning with the success-story algorithm.
In G. Weiss, ed.,
Distributed Artificial Intelligence
Meets Machine Learning, pages 82-93,
Springer, Berlin, 1997.
- 5.
- R. Salustowicz, M. Wiering, and J. Schmidhuber.
On learning soccer strategies.
In W. Gerstner, A. Germond, M. Hasler, J.-D. Nicoud, eds.,
Proceedings of the International Conference on
Artificial Neural Networks, Lausanne, Switzerland,
Springer, 769-774, 1997.
- 4.
- J. Zhao and J. Schmidhuber.
Incremental self-improvement for
life-time multiagent reinforcement learning.
In Pattie Maes, Maja Mataric, Jean-Arcady Meyer, Jordan Pollack,
and Stewart W. Wilson, eds.,
From Animals to Animats 4: Proceedings
of the Fourth International Conference on Simulation of Adaptive
Behavior, pages 516-525, MIT Press, Bradford Books, Cambridge, MA, 1996.
- 3.
- J. Schmidhuber.
Realistic multiagent reinforcement learning. In
G. Weiss, ed., Learning in Distributed
Artificial Intelligence Systems. Working Notes of the
1996 ECAI Workshop, 1996.
- 2.
- J. Schmidhuber.
A general method for multiagent learning
in unrestricted environments. In
1996 AAAI Symposium on Adaptation, Co-evolution and
Learning in Multiagent Systems, TR SS-96-01,
pages 84-87, AAAI Press, Menlo Park, Calif., 1996.
- 1.
- J. Schmidhuber.
Evolutionary principles in self-referential learning, or on learning
how to learn: The meta-meta-... hook. Diploma thesis,
Institut für Informatik, Technische Universität München, 1987.
LOW-COMPLEXITY ART AND BEAUTY
The concept of "Low-Complexity Art" was
introduced in 1994.
Low-complexity art is the computer age equivalent of
minimal art. It is art with low Kolmogorov complexity -
art that can be generated by a short algorithm.
This is related to
a simple theory of beauty, which essentially claims:
among several patterns classified as "comparable" by
some subjective observer, the subjectively most beautiful is
the one with the simplest (shortest) description, given the
observer's particular method for encoding and memorizing it.
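The claim can be made concrete with a toy observer. The sketch below is only an illustration under a strong assumption: the observer's "method for encoding" is taken to be zlib compression, a crude and computable stand-in for description length (Kolmogorov complexity itself is not computable). The function names and the two example patterns are hypothetical.

```python
import random
import zlib

def description_length(pattern: bytes) -> int:
    """Length in bytes of the pattern under the assumed observer encoding (zlib)."""
    return len(zlib.compress(pattern, 9))

def most_beautiful(patterns: dict) -> str:
    """Among comparable patterns, pick the one with the shortest description."""
    return min(patterns, key=lambda name: description_length(patterns[name]))

rng = random.Random(0)
patterns = {
    "stripes": b"ab" * 512,                                 # highly regular
    "noise": bytes(rng.randrange(256) for _ in range(1024)),  # no obvious regularity
}

for name, data in patterns.items():
    print(name, description_length(data))
print(most_beautiful(patterns))
```

A different observer, modeled by a different encoder, may rank the same patterns differently; that dependence on the observer is part of the claim.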
- 5.
- J. Schmidhuber.
Facial beauty and fractal geometry.
Note IDSIA-28-98, IDSIA, June 1998 (1.29 M, ca. 4.96 M gunzipped).
HTML version
(ca. 450 K, including 5 color figures).
- 4.
-
J. Schmidhuber.
Low-Complexity Art.
Leonardo, Journal of the
International Society for the Arts, Sciences, and
Technology, 30(2):97-103, MIT Press, 1997.
Print on a high-resolution (600 dpi) printer,
preferably double-sided on A4 paper
(172 K, uncompresses to 1.1 M).
HTML version.
- 3.
- J. Schmidhuber.
Femmes Fractales. Report IDSIA-99-97, IDSIA, December 1997.
- 2.
- J. Schmidhuber. Algorithmisch einfache Kunst. Manuscript, 1994.
- 1.
- J. Schmidhuber.
Low-Complexity Art.
Report FKI-197-94, Fakultät für Informatik, Technische
Universität München, 1994.
LIFE, UNIVERSE, AND EVERYTHING
In the beginning the Great Programmer wrote a
program that computes all computable universes
instead of just ours. This greatly simplified things.
- 1.
- J. Schmidhuber.
A computer scientist's view of life, the universe, and everything.
In C. Freksa, M. Jantzen, and R. Valk, eds.,
Foundations of Computer Science: Potential - Theory - Cognition,
Lecture Notes in Computer Science,
pages 201-208, Springer, 1997.
HTML version.
Back to