Compressed Network Search Finds Complex Neural Controllers with a Million Weights
First Deep Learner to learn control policies directly from highdimensional sensory input using reinforcement learning
Jürgen Schmidhuber, 2013
Many traditional methods of Evolutionary Computation [1519] can evolve problem solvers with hundreds of parameters, but not millions. Ours can [1,2], by greatly reducing the search space through evolving compact, compressed descriptions [38] of huge solvers. For example, a Recurrent Neural Network [3436] with over a million synapses or weights learned (without a teacher) to drive a simulated car based on a highdimensional videolike visual input stream. This was made possible through the efforts of my team members Dr Faustino Gomez, Dr Jan Koutnik, Giuseppe Cuccu, and Rupesh Kumar Srivastava.
In 1987, our first approach [20] towards scaling up Genetic Algorithms etc [1519] was to apply Genetic Programming (GP) [18,18a,19] to itself, to evolve even better GP methods [20]  a form of metaGP [20] or selfimprovement [37]. (Later others got interested in GP as well, e.g., [21].)
A somewhat less ambitious, but still rather general 1995 approach [9,10] was to compactly encode parameters of problem solvers, such as large neural networks (NN), through programs written in a universal programming language [2225]. Often it is much more efficient to systematically search the space of such programs with a bias towards short and fast programs [26,9,10,27], instead of directly searching the huge space of possible NN weight matrices.
Back in 1994, our universal [2225] language for encoding NN was assemblerlike [9,10]. In recent work, we replaced it by more practical languages [18] based on coefficients of popular transforms (Fourier, wavelet, discrete cosine, etc).
RNN are general computers which can map input sequences to output sequences [3436]. The weight matrix of an RNN can be viewed as its program. Look at the weight matrix as if it were an image. We may compress it by encoding it through the coefficients of a Fouriertype transform (here: the discrete cosine transform DCT) [18]. While a successful RNN controller may need hundreds of thousands of parameters (weights) to solve its task, its compressed description may need just a few hundred.
We evolve compact DCTbased descriptions using our awardwinning Natural Evolution Strategies (NES) [1113] or CoSynaptic NeuroEvolution (CoSyNE) [14]. Related work includes evolution of compact network encodings through graph rewriting [28], Lindenmeyer Systems [29,30], or HyperNEAT [31,31a].
Check out recent demonstrations of our approach on two Reinforcement Learning (RL) tasks (no teacher!) in which the control networks receive raw, videolike, highdimensional visual input streams: (1)
a visionbased version of the wellknown octopus control task requiring
networks with over 3 thousand weights, and (2) a version of
the TORCS driving game [21a] where networks with over 1 million
weights are evolved to drive a car around a track using
video images from the driver's perspective [1,2].
Note that the above goes beyond traditional Deep Learning [38], which is limited to pattern classification/detection/segmentation, but cannot learn to act.
DeepMind's more recent RL system
also can deal with certain types of highdimensional visual inputs.
References
[1]
J. Koutnik, G. Cuccu, J. Schmidhuber, F. Gomez.
Evolving LargeScale Neural Networks for VisionBased Reinforcement Learning.
In Proceedings of the Genetic and Evolutionary
Computation Conference (GECCO), Amsterdam, July 2013.
PDF.
[2]
J. Koutnik, G. Cuccu, J. Schmidhuber, F. Gomez.
Evolving LargeScale Neural Networks for VisionBased TORCS.
In Foundations of Digital Games (FDG), Chania, Crete, 2013.
PDF.
[3]
R. K. Srivastava, F. Gomez, J. Schmidhuber.
Generalized Compressed Network Search.
In C. Coello Coello, V. Cutello, K. Deb, S. Forrest, G. Nicosia, M. Pavone, eds.,
12th Int. Conf. on Parallel Problem Solving from Nature  PPSN XII,
Taormina, 2012.
PDF.
[4]
F. Gomez, J. Koutnik, J. Schmidhuber.
Compressed Network Complexity Search.
In C. Coello Coello, V. Cutello, K. Deb, S. Forrest, G. Nicosia, M. Pavone, eds.,
12th Int. Conf. on Parallel Problem Solving from Nature  PPSN XII,
Taormina, 2012. Nominated for best paper award.
PDF.
[5]
F. Gomez, J. Koutnik, J. Schmidhuber.
Complexity Search for Compressed Neural Networks.
Proc. GECCO 2012.
PDF.
[6]
R. K. Srivastava, J. Schmidhuber, F. Gomez.
Generalized Compressed Network Search.
Proc. GECCO 2012.
PDF.
[7]
J. Koutnik, F. Gomez, J. Schmidhuber (2010). Evolving Neural Networks in Compressed Weight Space. Proceedings of the Genetic and Evolutionary Computation Conference
(GECCO2010), Portland, 2010.
PDF.
[8]
J. Koutnik, F. Gomez, J. Schmidhuber.
Searching for Minimal Neural Networks in Fourier Space.
The 3rd Conference on Artificial General Intelligence (AGI10), 2010.
PDF.
[9]
J. Schmidhuber.
Discovering solutions with low Kolmogorov complexity
and high generalization capability.
In A. Prieditis and S. Russell, editors, Machine Learning:
Proceedings of the Twelfth International Conference (ICML 1995),
pages 488496. Morgan
Kaufmann Publishers, San Francisco, CA, 1995.
PDF .
HTML.
[10]
J. Schmidhuber.
Discovering neural nets with low Kolmogorov complexity
and high generalization capability.
Neural Networks, 10(5):857873, 1997.
PDF.
HTML
[11]
T. Glasmachers, T. Schaul, Sun Yi, D. Wierstra, J. Schmidhuber.
Exponential Natural Evolution Strategies.
Proceedings of the Genetic and Evolutionary Computation Conference
(GECCO2010), Portland, 2010. PDF.
Best paper award.
[12]
S. Yi, D. Wierstra, T. Schaul, J. Schmidhuber.
Efficient Natural Evolution Strategies.
Genetic and Evolutionary Computation Conference (GECCO09),
Montreal, 2009.
PDF. Best paper award.
[13]
Yi Sun, F. Gomez, T. Schaul, J. Schmidhuber.
A Linear Time Natural Evolution Strategy for NonSeparable Functions.
In Proceedings of the Genetic and Evolutionary
Computation Conference (GECCO), Amsterdam, 2013.
Preprint arXiv:1106.1998v2 (2011).
[14]
F. Gomez, J. Schmidhuber, R. Miikkulainen.
Accelerated Neural Evolution through
Cooperatively Coevolved Synapses.
Journal of Machine Learning Research (JMLR),
9:937965, 2008.
PDF.
[15] I. Rechenberg. Evolutionsstrategie  Optimierung technischer Systeme nach Prinzipien der biologischen Evolution. Dissertation, 1971.
[16] H. P. Schwefel. Numerische Optimierung von ComputerModellen. Dissertation, 1974.
[16a]
L. J. Fogel, A. J. Owens, and M. J. Walsh. Artificial Intelligence through Simulated Evolution. New York: John Wiley, 1966.
[17] J. H. Holland. Adaptation in Natural and Artificial Systems. University of Michigan Press, Ann Arbor, 1975.
[18] N. L. Cramer. A representation for the adaptive generation of simple sequential programs. In J. J. Grefenstette, editor, Proceedings of an International Conference on Genetic Algorithms and Their Applications, CarnegieMellon University, July 2426, 1985, Hillsdale NJ, 1985. Lawrence Erlbaum Associates.
[18a] S. F. Smith. A Learning System Based on Genetic Adaptive Algorithms, PhD Thesis, Univ. Pittsburgh, 1980
[19] D. Dickmanns, J. Schmidhuber, and A. Winklhofer. Genetic Programming Implementation in Prolog, TUM, 1987. HTML.
[20] J. Schmidhuber. Evolutionary principles in selfreferential learning. Diploma thesis on MetaGP etc, TUM, 1987. HTML.
[21] J. R. Koza. Genetic Programming  On the Programming of Computers by Means of Natural Selection. MIT Press, 1992.
[21a]
E. Espie, C. Guionneau, B. Wymann, C. Dimitrakakis, R. Coulom, A. Sumner.
TORCS  The Open Racing Car Simulator, v1.3.1, 2008.
See TORCS credits.
[22] K. Goedel. Ueber formal unentscheidbare Saetze der Principia Mathematica und verwandter Systeme I. Monatshefte fuer Mathematik und Physik, 38:173198, 1931.
[23] A. Church. An unsolvable problem of elementary number theory. American Journal of Mathematics, 58:345363, 1936.
[24] A. M. Turing. On computable numbers, with an application to the Entscheidungsproblem. Proceedings of the London Mathematical Society, Series 2, 41:230267, 1936.
[25] E. L. Post. Finite combinatory processesformulation 1. The Journal of Symbolic Logic, 1(3):103105, 1936.
[26] L. A. Levin. Universal sequential search problems. Problems of Information Transmission, 9(3):265266, 1973.
[27] J. Schmidhuber. Optimal ordered problem solver. Machine Learning, 54:211254, 2004.
PDF.
HTML.
HTML overview.
[28] H. Kitano. Designing neural networks using genetic algorithms with graph generation system. Complex Systems, 4:461476, 1990.
[29] A. Lindenmayer: Mathematical models for cellular interaction in development. In: J. Theoret. Biology. 18. 1968, 280315.
[30] C. Jacob , A. Lindenmayer , G. Rozenberg. Genetic LSystem Programming, Parallel Problem Solving from Nature III, Lecture Notes in Computer Science, 1994
[31] D. B. D'Ambrosio and K. O. Stanley. A novel generative encoding for exploiting neural network sensor and output geometry. Proc. GECCO, p 974981, New York, NY, USA, 2007.
[31a] K. O. Stanley, D. B. D'Ambrosio, J. Gauci. A HypercubeBased
Encoding for Evolving LargeScale Neural Networks. Artificial Life
Journal 15(2), p 185212. Cambridge, MA: MIT Press, 2009
[32] J.S. Evolutionary Computation overview
[33] J.S. Reinforcement Learning (RL) overview
[34] J.S. Recurrent Neural Network overview
[35] J.S. RNN book preface
[36] J.S. Coevolving neurons
[37] J.S. Metalearning overview
[38] J.S.
Deep Learning since 1991
.
