36.
J. Schmidhuber. Maximizing Fun By Creating Data With Easily Reducible Subjective Complexity.
In G. Baldassarre and M. Mirolli (eds.), Roadmap for Intrinsically Motivated Learning.
Springer, 2012, in press.
35.
J. Schmidhuber. A Formal Theory of Creativity to Model the Creation of Art.
In J. McCormack (ed.), Computational Creativity. MIT Press, 2012.
PDF of older preprint.
34.
L. Pape, C. M. Oddo, M. Controzzi, C. Cipriani, A. Foerster, M. C. Carrozza, J. Schmidhuber.
Learning tactile skills through curious exploration.
Frontiers in Neurorobotics 6:6, 2012, doi: 10.3389/fnbot.2012.00006
33.
H. Ngo, M. Luciw, A. Foerster, J. Schmidhuber.
Learning Skills from Play: Artificial Curiosity on a Katana Robot Arm.
Proc. IJCNN 2012.
PDF.
Video.
32.
V. R. Kompella, M. Luciw, M. Stollenga, L. Pape, J. Schmidhuber.
Autonomous Learning of Abstractions using Curiosity-Driven Modular Incremental Slow Feature Analysis.
Proc. IEEE Conference on Development and Learning / EpiRob 2012
(ICDL-EpiRob'12), San Diego, 2012, in press.
31.
R. K. Srivastava, B. Steunebrink, J. Schmidhuber.
First Experiments with PowerPlay.
Neural Networks, 2013.
ArXiv preprint (2012):
arXiv:1210.8385 [cs.AI].
30.
R. K. Srivastava, B. R. Steunebrink, M. Stollenga, J. Schmidhuber.
Continually Adding Self-Invented
Problems to the Repertoire: First
Experiments with POWERPLAY.
Proc. IEEE Conference on Development and Learning / EpiRob 2012
(ICDL-EpiRob'12), San Diego, 2012.
PDF.
29.
J. Schmidhuber.
POWERPLAY: Training an Increasingly General Problem Solver by Continually Searching for the Simplest Still Unsolvable Problem.
Frontiers in Cognitive Science, 2013.
ArXiv preprint (2011):
arXiv:1112.5309 [cs.AI]
28.
Y. Sun, F. Gomez, J. Schmidhuber.
Planning to Be Surprised: Optimal Bayesian Exploration in Dynamic Environments.
In Proc. Fourth Conference on Artificial General Intelligence (AGI-11),
Google, Mountain View, California, 2011.
PDF.
27.
V. Graziano, T. Glasmachers, T. Schaul, L. Pape, G. Cuccu, J. Leitner, J. Schmidhuber. Artificial Curiosity for Autonomous Space Exploration. Acta Futura 4:41-51, 2011 (DOI: 10.2420/AF04.2011.41). PDF.
26.
G. Cuccu, M. Luciw, J. Schmidhuber, F. Gomez.
Intrinsically Motivated Evolutionary Search for Vision-Based Reinforcement Learning.
In Proc. Joint IEEE International Conference on Development and Learning (ICDL) and on Epigenetic Robotics (ICDL-EpiRob 2011), Frankfurt, 2011.
PDF.
25.
M. Luciw, V. Graziano, M. Ring, J. Schmidhuber.
Artificial Curiosity with Planning for Autonomous Visual and Perceptual Development.
In Proc. Joint IEEE International Conference on Development and Learning (ICDL) and on Epigenetic Robotics (ICDL-EpiRob 2011), Frankfurt, 2011.
PDF.
24.
T. Schaul, L. Pape, T. Glasmachers, V. Graziano, J. Schmidhuber.
Coherence Progress: A Measure of Interestingness Based on Fixed Compressors.
In Proc. Fourth Conference on Artificial General Intelligence (AGI-11),
Google, Mountain View, California, 2011.
PDF.
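The fixed-compressor idea behind coherence progress can be illustrated with an off-the-shelf compressor. The following toy sketch is an assumption for illustration only (the paper studies several compression-based variants); it uses zlib to score how much a new observation "coheres" with the history:

```python
import zlib

def C(data: bytes) -> int:
    """Compressed length under a fixed off-the-shelf compressor (zlib)."""
    return len(zlib.compress(data, 9))

def coherence(history: bytes, obs: bytes) -> int:
    """Bytes saved by compressing obs together with the history instead of
    separately. Positive values mean obs shares compressor-visible
    regularities with what was seen before."""
    return C(history) + C(obs) - C(history + obs)

history = b"the quick brown fox jumps over the lazy dog " * 20
familiar = coherence(history, b"the quick brown fox ")   # echoes the history
unrelated = coherence(history, bytes(range(20)))         # no shared structure
```

Data that repeats regularities already present in the history scores higher than unrelated data, even though the compressor itself never learns.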
23.
T. Schaul, Y. Sun, D. Wierstra, F. Gomez, J. Schmidhuber. Curiosity-Driven Optimization. IEEE Congress on Evolutionary Computation (CEC-2011), 2011.
PDF.
22.
H. Ngo, M. Ring, J. Schmidhuber.
Curiosity Drive based on Compression Progress for Learning Environment Regularities.
In Proc. Joint IEEE International Conference on Development and Learning (ICDL) and on Epigenetic Robotics (ICDL-EpiRob 2011), Frankfurt, 2011.
21.
J. Schmidhuber. Formal Theory of Creativity, Fun, and Intrinsic Motivation (1990-2010). IEEE Transactions on Autonomous Mental Development, 2(3):230-247, 2010.
IEEE link.
PDF of draft.
20. J. Schmidhuber. Artificial Scientists & Artists Based on the Formal Theory of Creativity.
In
Proceedings of the Third Conference on Artificial General Intelligence (AGI-2010), Lugano, Switzerland.
PDF.
19. J. Schmidhuber. Art & science as by-products of the search for novel patterns, or data compressible in unknown yet learnable ways. In M. Botta (ed.), Multiple ways to design research. Research cases that reshape the design discipline, Milano-Lugano, Swiss Design Network - Et al. Edizioni, 2009, pp. 98-112. (Keynote talk.) PDF of preprint.
18. J. Schmidhuber.
Driven by Compression Progress: A Simple Principle Explains Essential Aspects of Subjective Beauty, Novelty, Surprise, Interestingness, Attention, Curiosity, Creativity, Art, Science, Music, Jokes.
Based on keynote talk for KES 2008 (below) and joint invited
lecture for ALT 2007 / DS 2007 (below). Short version: ref 17 below. Long version in G. Pezzulo, M. V. Butz, O. Sigaud, G. Baldassarre, eds.: Anticipatory Behavior in Adaptive Learning Systems, from Sensorimotor to Higher-level Cognitive Capabilities, Springer, LNAI, 2009.
Preprint (2008, revised 2009): arXiv:0812.4360.
PDF (Dec 2008).
PDF (April 2009).
17. J. Schmidhuber.
Simple Algorithmic Theory of Subjective Beauty, Novelty, Surprise,
Interestingness, Attention, Curiosity, Creativity, Art,
Science, Music, Jokes. Journal of SICE, 48(1):21-32, 2009.
PDF.
16.
J. Schmidhuber.
Driven by Compression Progress. In Proc.
Knowledge-Based Intelligent Information and
Engineering Systems KES-2008,
Lecture Notes in Computer Science LNCS 5177, p. 11, Springer, 2008.
(Abstract of invited keynote talk.)
PDF.
15.
J. Schmidhuber.
Simple Algorithmic Principles of Discovery, Subjective Beauty,
Selective Attention, Curiosity & Creativity.
In V. Corruble, M. Takeda, E. Suzuki, eds.,
Proc. 10th Intl. Conf. on Discovery Science (DS 2007)
p. 26-38, LNAI 4755, Springer, 2007.
Also in M. Hutter, R. A. Servedio, E. Takimoto, eds.,
Proc. 18th Intl. Conf. on Algorithmic Learning Theory (ALT 2007)
p. 32, LNAI 4754, Springer, 2007.
(Joint invited lecture for DS 2007 and ALT 2007, Sendai, Japan, 2007.)
Preprint: arXiv:0709.0674.
PDF.
Curiosity as the drive to improve the compression
of the lifelong sensory input stream: interestingness as
the first derivative of subjective "beauty" or compressibility.
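The compression-progress principle sketched above can be made concrete in a few lines. The following toy is a hypothetical illustration, not the system described in the paper: a tiny adaptive bigram model plays the role of the improving compressor, and the intrinsic reward is the drop in codelength of the data caused by learning from it, so an already-familiar regularity yields little further reward:

```python
import math
from collections import defaultdict

class AdaptiveModel:
    """Laplace-smoothed bigram model: a deliberately tiny 'learning compressor'."""
    def __init__(self, alphabet_size=2):
        self.k = alphabet_size
        self.counts = defaultdict(lambda: [0] * alphabet_size)

    def prob(self, ctx, sym):
        c = self.counts[ctx]
        return (c[sym] + 1) / (sum(c) + self.k)

    def codelength(self, seq):
        """Bits needed to encode seq under the current, frozen model."""
        bits, ctx = 0.0, 0
        for s in seq:
            bits -= math.log2(self.prob(ctx, s))
            ctx = s
        return bits

    def learn(self, seq):
        ctx = 0
        for s in seq:
            self.counts[ctx][s] += 1
            ctx = s

def compression_progress(model, seq):
    """Intrinsic reward: drop in codelength of the data caused by learning from it."""
    before = model.codelength(seq)
    model.learn(seq)
    return before - model.codelength(seq)

m = AdaptiveModel()
pattern = [0, 1] * 50
r1 = compression_progress(m, pattern)   # large: the regularity is still new
r2 = compression_progress(m, pattern)   # small: the regularity is already known
```

The reward is high exactly while the pattern is learnable but not yet learnt, and it decays as the compressor catches up: interestingness as the first derivative of compressibility.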
14.
J. Schmidhuber.
Developmental Robotics,
Optimal Artificial Curiosity, Creativity, Music, and the Fine Arts.
Connection Science, 18(2): 173-187, June 2006.
PDF.
On mathematically optimal universal artificial curiosity,
based on theoretically best possible ways of maximizing learning progress
in embedded agents or robots with an intrinsic motivation to
learn skills that lead to a better understanding of the world and what can be done in it.
It is also pointed out how music and the arts can be formally understood as a consequence of
the principle of artificial curiosity and creativity.
13.
J. Schmidhuber.
Self-Motivated Development Through
Rewards for Predictor Errors / Improvements.
Developmental Robotics 2005 AAAI Spring Symposium,
March 21-23, 2005, Stanford University, CA.
PDF.
12.
J. Schmidhuber.
Exploring the Predictable.
In A. Ghosh, S. Tsutsui, eds., Advances in Evolutionary Computing,
p. 579-612, Springer, 2002.
PDF.
HTML.
One of the key publications - see more details under refs [8, 11, 1997-].
11.
J. Schmidhuber.
Artificial Curiosity Based on Discovering Novel Algorithmic
Predictability Through Coevolution.
In P. Angeline, Z. Michalewicz, M. Schoenauer, X. Yao, Z.
Zalzala, eds., Congress on Evolutionary Computation, p. 1612-1618,
IEEE Press, Piscataway, NJ, 1999.
11a.
J. Schmidhuber.
What's interesting?
In Abstract Collection of SNOWBIRD:
Machines That Learn.
Utah, April 1998.
10.
M. Wiering and J. Schmidhuber.
Efficient model-based exploration.
In R. Pfeifer, B. Blumberg, J. A. Meyer, S. W. Wilson, eds.,
From Animals to Animats 5: Proceedings
of the Fifth International Conference on Simulation of Adaptive
Behavior, p. 223-228, MIT Press, 1998.
9.
M. Wiering and J. Schmidhuber.
Learning exploration policies with models.
In Proc. CONALD, 1998.
8.
J. Schmidhuber.
What's interesting?
Technical Report IDSIA-35-97, IDSIA, July 1997
(23 pages, 10 figures).
Here the focus is on the automatic creation of predictable internal
abstractions of complex spatio-temporal events:
two competing, intrinsically motivated agents agree on essentially
arbitrary algorithmic experiments and bet
on their possibly surprising (not yet predictable)
outcomes in zero-sum games.
Each agent can profit from outwitting / surprising
the other by inventing experimental protocols on whose
predicted outcome the two modules disagree.
The emphasis is on exploring
the space of general algorithms (as opposed to
traditional simple mappings from inputs to
outputs); the general system
[12] concentrates on the interesting
things by losing interest in both the predictable and the
unpredictable aspects of the world. Unlike the previous
systems with intrinsic motivation (1990, 91, 95, see below), this one also
takes into account
the computational cost of learning new skills: learning when to learn and what to learn.
See also refs [11, 12, 1998-2002].
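A toy rendition of the two-module betting scheme may help; this is a drastic simplification with hypothetical names and a trivial environment, not the 1997 system itself. Two tabular predictors bet wherever their predictions disagree, and disagreements (hence bets) die out as the world becomes predictable to both:

```python
import random

class Better:
    """Tabular predictor; the pair below bets only where their predictions differ."""
    def __init__(self, default_guess):
        self.default = default_guess
        self.stats = {}              # x -> (ones, total)

    def predict(self, x):
        ones, total = self.stats.get(x, (0, 0))
        if total == 0:
            return self.default      # no data yet: fall back on a prior guess
        return 1 if ones * 2 > total else 0

    def observe(self, x, y):
        ones, total = self.stats.get(x, (0, 0))
        self.stats[x] = (ones + y, total + 1)

def outcome(x):
    # hypothetical environment with a learnable rule (parity), unknown to both modules
    return x % 2

random.seed(1)
a, b = Better(0), Better(1)
score_a, bets = 0, 0                 # zero-sum: b's score is -score_a
for _ in range(200):
    x = random.randrange(10)
    if a.predict(x) != b.predict(x):   # a disagreement is a bet-worthy "experiment"
        bets += 1
        score_a += 1 if a.predict(x) == outcome(x) else -1
    y = outcome(x)
    a.observe(x, y)
    b.observe(x, y)
# bets stop once the environment is predictable to both modules
```

In this toy, a bet can only occur on the first encounter with each input, so the betting activity collapses as soon as the regularity is shared knowledge; in the actual system the modules invent far more general algorithmic experiments.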
7.
J. Schmidhuber, J. Zhao, N. Schraudolph.
Reinforcement learning with self-modifying policies.
In S. Thrun and L. Pratt, eds.,
Learning to learn, Kluwer, pages 293-309, 1997.
PDF;
HTML.
6.
J. Storck, S. Hochreiter, and J. Schmidhuber.
Reinforcement-driven information acquisition in non-deterministic
environments.
In Proc. ICANN'95, vol. 2, pages 159-164.
EC2 & CIE, Paris, 1995.
PDF.
HTML.
In this paper the curiosity reward is
again proportional to
the predictor's surprise / information gain, this time measured as the
Kullback-Leibler distance between the learning predictor's
subjective probability distributions
before and after new observations -
the relative entropy between its prior and posterior.
(In 2005, Itti & Baldi called this "Bayesian surprise" and
demonstrated experimentally that it explains certain patterns of
human visual attention better than previous approaches.)
Note the differences from "Active Learning": the latter typically focuses on
choosing which data points to evaluate next in order to maximize information gain (i.e., one-step look-ahead), assuming all data point evaluations are equally costly. The 1995 system, however, is
more general and takes into account:
(1) arbitrary delays between the agent's experimental actions and the corresponding information gains,
(2) the highly environment-dependent costs of obtaining or creating not
just individual data points but entire data sequences.
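The information-gain reward can be sketched for the simplest case of a categorical belief with Laplace-smoothed counts; this is a toy illustration under assumed names, not the paper's network-based system. The intrinsic reward is the relative entropy between the subjective distribution after and before each observation:

```python
import math

def kl(p, q):
    """Kullback-Leibler divergence D(p || q) in bits."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

class SubjectiveDistribution:
    """Next-symbol belief with Laplace-smoothed counts (uniform prior)."""
    def __init__(self, k):
        self.counts = [1] * k

    def dist(self):
        total = sum(self.counts)
        return [c / total for c in self.counts]

    def curiosity_reward(self, symbol):
        """Intrinsic reward: relative entropy between the belief after and
        before the observation, i.e., the predictor's information gain."""
        prior = self.dist()
        self.counts[symbol] += 1
        return kl(self.dist(), prior)

belief = SubjectiveDistribution(2)
r_first = belief.curiosity_reward(0)        # early observation: large belief shift
for _ in range(98):
    belief.curiosity_reward(0)
r_late = belief.curiosity_reward(0)         # same observation later: tiny shift
```

As the counts accumulate, each repeated observation moves the belief less, so the same event earns a shrinking curiosity reward.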
5.
J. Schmidhuber.
On learning how to learn learning strategies.
Technical Report FKI-198-94, Fakultät für Informatik,
Technische Universität München, November 1994.
PDF.
4.
J. Schmidhuber.
Curious model-building control systems.
In Proc. International Joint Conference on Neural Networks,
Singapore, volume 2, pages 1458-1463. IEEE, 1991.
PDF.
HTML.
The second peer-reviewed English-language publication
on artificial curious agents with intrinsic motivation. The system
uses reinforcement learning to create behaviors that lead to parts of the environment
where previous experience indicates that the prediction error can be improved (not necessarily
where it is high). So the agent is neither attracted by unpredictable
randomness nor by totally predictable aspects of the world. Instead
it likes to go where it learnt to expect additional learning progress.
(Quite a few later publications on developmental robotics and intrinsic reward took up this basic idea, e.g.,
Oudeyer & Kaplan (2007), whose work, however, is restricted to one-step look-ahead
and does not allow for delayed intrinsic rewards, unlike the 1991 paper above.)
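The difference between rewarding error *improvement* and rewarding raw error can be sketched with a toy two-region world; this is a hypothetical setup for illustration, not the 1991 architecture. A noisy region keeps the prediction error high but yields no progress, while a learnable region yields a stream of improvement reward:

```python
import random

random.seed(0)

class Predictor:
    """Running-mean predictor per region; error improvement is the intrinsic reward."""
    def __init__(self):
        self.mean = {}
        self.err = {}     # smoothed recent absolute error per region

    def step(self, region, observation, lr=0.2):
        m = self.mean.get(region, 0.0)
        e = abs(observation - m)
        self.mean[region] = m + lr * (observation - m)
        prev = self.err.get(region, e)
        self.err[region] = 0.9 * prev + 0.1 * e
        return prev - self.err[region]   # reward: did the error go DOWN here?

def observe(region):
    # hypothetical world: region 0 is pure noise, region 1 is learnable (constant 5)
    return random.uniform(-1, 1) if region == 0 else 5.0

p = Predictor()
total = {0: 0.0, 1: 0.0}
for _ in range(200):
    for region in (0, 1):
        total[region] += p.step(region, observe(region))
```

The cumulative improvement reward concentrates in the learnable region, even though the raw prediction error stays higher in the noisy one; an agent that maximized raw error would get stuck staring at the noise.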
3.
J. Schmidhuber.
Adaptive confidence and adaptive curiosity. Technical Report FKI-149-91, Inst. f. Informatik, Tech. Univ. Munich, April 1991.
PDF.
2.
J. Schmidhuber.
A possibility for implementing curiosity and boredom in
model-building neural controllers.
In J. A. Meyer and S. W. Wilson, editors, Proc. of the
International Conference on Simulation
of Adaptive Behavior: From Animals to
Animats, pages 222-227. MIT Press/Bradford Books, 1991.
PDF.
HTML.
The first peer-reviewed English-language publication
on artificial curious agents with intrinsic motivation. The system
uses reinforcement learning to create behaviors that lead the agent to parts of the environment
where the separate predictor's
prediction error is expected to be high, assuming one can learn something there.
Quite a few later publications on developmental robotics and/or intrinsic reward took up this basic idea, e.g.,
Singh & Barto & Chentanez (2005).
1.
J. Schmidhuber.
Making the world differentiable: On using fully recurrent
self-supervised neural networks for dynamic reinforcement learning and
planning in non-stationary environments.
Technical Report FKI-126-90, TUM, Feb 1990, revised Nov 1990.
PDF.
The first paper on planning with reinforcement learning recurrent neural networks (NNs),
and on generative adversarial networks,
where a generator NN fights a predictor NN in a minimax game.
1a.
J. Schmidhuber. Dynamische neuronale Netze und das fundamentale raumzeitliche Lernproblem (Dynamic neural nets and the fundamental spatio-temporal credit assignment problem). Dissertation, Institut für Informatik, Technische Universität München, 1990. PDF.
HTML.
Differences from the Shannon / Boltzmann notion of surprise.
Since the early 1990s, the papers above have repeatedly
pointed out an essential difference
between our theory of
surprise & novelty and Shannon's traditional information theory
based on Boltzmann's entropy notion.
Consider two extreme examples of uninteresting, unsurprising,
boring data.
A vision-based agent that always stays in the dark will experience
an extremely compressible, soon totally predictable and unsurprising
history of
unchanging visual inputs.
In front of a screen full of white noise, however,
which conveys a lot of information and "novelty" and "surprise"
in the traditional sense of Boltzmann (1800s) and Shannon (1948),
it will experience highly unpredictable
and fundamentally incompressible data.
In both cases the data gets boring quickly as it does not
allow for learning new things or for further compression progress.
Neither the arbitrary nor the fully predictable is truly
novel or surprising or interesting - only
data with still unknown but learnable
statistical or algorithmic regularities are!
That's why our theory of surprise and curiosity and creativity takes
the time-varying state of the subjective, learning observer into account.
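The two boring extremes are easy to demonstrate with a standard compressor; in this toy illustration, zlib stands in for the observer's compressor:

```python
import random
import zlib

random.seed(0)
dark = bytes(1000)                                          # agent in the dark: constant input
noise = bytes(random.getrandbits(8) for _ in range(1000))   # white noise on a screen

# The dark history is almost perfectly compressible and immediately predictable;
# the noise is Shannon-"novel" yet fundamentally incompressible. In neither case
# is anything left to learn, so compression progress (and interest) is zero.
dark_len = len(zlib.compress(dark))
noise_len = len(zlib.compress(noise))
```

Only data in between the extremes, compressible but not yet compressed, can generate progress for a learning observer.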
Check out related papers on adaptive visual attention with foveas (overview page):
J. Schmidhuber and R. Huber.
Learning to
generate artificial fovea trajectories for target detection.
International Journal of Neural Systems, 2(1 & 2):135-141, 1991.
Figures in overview page.
PDF.
HTML.
J. Schmidhuber and R. Huber.
Using sequential adaptive neuro-control for efficient learning of
rotation and translation invariance.
In T. Kohonen,
K. Mäkisara, O. Simula, and J. Kangas, editors,
Artificial Neural Networks, pages 315-320.
Elsevier Science Publishers B.V., North-Holland, 1991.