Compressed Network Search Finds Complex Neural Controllers with a Million Weights First Deep Learner to learn control policies directly from high-dimensional sensory input using reinforcement learning Jürgen Schmidhuber, 2013 Many traditional methods of Evolutionary Computation [15-19] can evolve problem solvers with hundreds of parameters, but not millions. Ours can [1,2], by greatly reducing the search space through evolving compact, compressed descriptions [3-8] of huge solvers. For example, a Recurrent Neural Network [34-36] with over a million synapses or weights learned (without a teacher) to drive a simulated car based on a high-dimensional video-like visual input stream. This was made possible through the efforts of my team members Dr Faustino Gomez, Dr Jan Koutnik, Giuseppe Cuccu, and Rupesh Kumar Srivastava. In 1987, our first approach [20] towards scaling up Genetic Algorithms etc [15-19] was to apply Genetic Programming (GP) [18,18a,19] to itself, to evolve even better GP methods [20] - a form of meta-GP [20] or self-improvement [37]. (Later others got interested in GP as well, e.g., [21].) A somewhat less ambitious, but still rather general 1995 approach [9,10] was to compactly encode parameters of problem solvers, such as large neural networks (NN), through programs written in a universal programming language [22-25]. Often it is much more efficient to systematically search the space of such programs with a bias towards short and fast programs [26,9,10,27], instead of directly searching the huge space of possible NN weight matrices. Back in 1994, our universal [22-25] language for encoding NN was assembler-like [9,10]. In recent work, we replaced it by more practical languages [1-8] based on coefficients of popular transforms (Fourier, wavelet, discrete cosine, etc). RNN are general computers which can map input sequences to output sequences [34-36]. The weight matrix of an RNN can be viewed as its program. Look at the weight matrix as if it were an image. We may compress it by encoding it through the coefficients of a Fourier-type transform (here: the discrete cosine transform DCT) [1-8]. While a successful RNN controller may need hundreds of thousands of parameters (weights) to solve its task, its compressed description may need just a few hundred. We evolve compact DCT-based descriptions using our award-winning Natural Evolution Strategies (NES) [11-13] or Co-Synaptic Neuro-Evolution (CoSyNE) [14]. Related work includes evolution of compact network encodings through graph rewriting [28], Lindenmeyer Systems [29,30], or HyperNEAT [31,31a]. Check out recent demonstrations of our approach on two Reinforcement Learning (RL) tasks (no teacher!) in which the control networks receive raw, video-like, high-dimensional visual input streams: (1) a vision-based version of the well-known octopus control task requiring networks with over 3 thousand weights, and (2) a version of the TORCS driving game [21a] where networks with over 1 million weights are evolved to drive a car around a track using video images from the driver's perspective [1,2]. Note that the above goes beyond traditional Deep Learning [38], which is limited to pattern classification/detection/segmentation, but cannot learn to act. DeepMind's more recent RL system also can deal with certain types of high-dimensional visual inputs. References [1] J. Koutnik, G. Cuccu, J. Schmidhuber, F. Gomez. Evolving Large-Scale Neural Networks for Vision-Based Reinforcement Learning. In Proceedings of the Genetic and Evolutionary Computation Conference (GECCO), Amsterdam, July 2013. PDF. [2] J. Koutnik, G. Cuccu, J. Schmidhuber, F. Gomez. Evolving Large-Scale Neural Networks for Vision-Based TORCS. In Foundations of Digital Games (FDG), Chania, Crete, 2013. PDF. [3] R. K. Srivastava, F. Gomez, J. Schmidhuber. Generalized Compressed Network Search. In C. Coello Coello, V. Cutello, K. Deb, S. Forrest, G. Nicosia, M. Pavone, eds., 12th Int. Conf. on Parallel Problem Solving from Nature - PPSN XII, Taormina, 2012. PDF. [4] F. Gomez, J. Koutnik, J. Schmidhuber. Compressed Network Complexity Search. In C. Coello Coello, V. Cutello, K. Deb, S. Forrest, G. Nicosia, M. Pavone, eds., 12th Int. Conf. on Parallel Problem Solving from Nature - PPSN XII, Taormina, 2012. Nominated for best paper award. PDF. [5] F. Gomez, J. Koutnik, J. Schmidhuber. Complexity Search for Compressed Neural Networks. Proc. GECCO 2012. PDF. [6] R. K. Srivastava, J. Schmidhuber, F. Gomez. Generalized Compressed Network Search. Proc. GECCO 2012. PDF. [7] J. Koutnik, F. Gomez, J. Schmidhuber (2010). Evolving Neural Networks in Compressed Weight Space. Proceedings of the Genetic and Evolutionary Computation Conference (GECCO-2010), Portland, 2010. PDF. [8] J. Koutnik, F. Gomez, J. Schmidhuber. Searching for Minimal Neural Networks in Fourier Space. The 3rd Conference on Artificial General Intelligence (AGI-10), 2010. PDF. [9] J.  Schmidhuber. Discovering solutions with low Kolmogorov complexity and high generalization capability. In A. Prieditis and S. Russell, editors, Machine Learning: Proceedings of the Twelfth International Conference (ICML 1995), pages 488-496. Morgan Kaufmann Publishers, San Francisco, CA, 1995. PDF . HTML. [10] J. Schmidhuber. Discovering neural nets with low Kolmogorov complexity and high generalization capability. Neural Networks, 10(5):857-873, 1997. PDF. HTML [11] T. Glasmachers, T. Schaul, Sun Yi, D. Wierstra, J. Schmidhuber. Exponential Natural Evolution Strategies. Proceedings of the Genetic and Evolutionary Computation Conference (GECCO-2010), Portland, 2010. PDF. Best paper award. [12] S. Yi, D. Wierstra, T. Schaul, J. Schmidhuber. Efficient Natural Evolution Strategies. Genetic and Evolutionary Computation Conference (GECCO-09), Montreal, 2009. PDF. Best paper award. [13] Yi Sun, F. Gomez, T. Schaul, J. Schmidhuber. A Linear Time Natural Evolution Strategy for Non-Separable Functions. In Proceedings of the Genetic and Evolutionary Computation Conference (GECCO), Amsterdam, 2013. Preprint arXiv:1106.1998v2 (2011). [14] F. Gomez, J. Schmidhuber, R. Miikkulainen. Accelerated Neural Evolution through Cooperatively Coevolved Synapses. Journal of Machine Learning Research (JMLR), 9:937-965, 2008. PDF. [15] I. Rechenberg. Evolutionsstrategie - Optimierung technischer Systeme nach Prinzipien der biologischen Evolution. Dissertation, 1971. [16] H. P. Schwefel. Numerische Optimierung von Computer-Modellen. Dissertation, 1974. [16a] L. J. Fogel, A. J. Owens, and M. J. Walsh. Artificial Intelligence through Simulated Evolution. New York: John Wiley, 1966. [17] J. H. Holland. Adaptation in Natural and Artificial Systems. University of Michigan Press, Ann Arbor, 1975. [18] N. L. Cramer. A representation for the adaptive generation of simple sequential programs. In J. J. Grefenstette, editor, Proceedings of an International Conference on Genetic Algorithms and Their Applications, Carnegie-Mellon University, July 24-26, 1985, Hillsdale NJ, 1985. Lawrence Erlbaum Associates. [18a] S. F. Smith. A Learning System Based on Genetic Adaptive Algorithms, PhD Thesis, Univ. Pittsburgh, 1980 [19] D. Dickmanns, J. Schmidhuber, and A. Winklhofer. Genetic Programming Implementation in Prolog, TUM, 1987. HTML. [20] J. Schmidhuber. Evolutionary principles in self-referential learning. Diploma thesis on Meta-GP etc, TUM, 1987. HTML. [21] J. R. Koza. Genetic Programming - On the Programming of Computers by Means of Natural Selection. MIT Press, 1992. [21a] E. Espie, C. Guionneau, B. Wymann, C. Dimitrakakis, R. Coulom, A. Sumner. TORCS - The Open Racing Car Simulator, v1.3.1, 2008. See TORCS credits. [22] K. Goedel. Ueber formal unentscheidbare Saetze der Principia Mathematica und verwandter Systeme I. Monatshefte fuer Mathematik und Physik, 38:173-198, 1931. [23] A. Church. An unsolvable problem of elementary number theory. American Journal of Mathematics, 58:345-363, 1936. [24] A. M. Turing. On computable numbers, with an application to the Entscheidungsproblem. Proceedings of the London Mathematical Society, Series 2, 41:230-267, 1936. [25] E. L. Post. Finite combinatory processes-formulation 1. The Journal of Symbolic Logic, 1(3):103-105, 1936. [26] L. A. Levin. Universal sequential search problems. Problems of Information Transmission, 9(3):265-266, 1973. [27] J. Schmidhuber. Optimal ordered problem solver. Machine Learning, 54:211-254, 2004. PDF. HTML. HTML overview. [28] H. Kitano. Designing neural networks using genetic algorithms with graph generation system. Complex Systems, 4:461-476, 1990. [29] A. Lindenmayer: Mathematical models for cellular interaction in development. In: J. Theoret. Biology. 18. 1968, 280-315. [30] C. Jacob , A. Lindenmayer , G. Rozenberg. Genetic L-System Programming, Parallel Problem Solving from Nature III, Lecture Notes in Computer Science, 1994 [31] D. B. D'Ambrosio and K. O. Stanley. A novel generative encoding for exploiting neural network sensor and output geometry. Proc. GECCO, p 974-981, New York, NY, USA, 2007. [31a] K. O. Stanley, D. B. D'Ambrosio, J. Gauci. A Hypercube-Based Encoding for Evolving Large-Scale Neural Networks. Artificial Life Journal 15(2), p 185-212. Cambridge, MA: MIT Press, 2009 [32] J.S. Evolutionary Computation overview [33] J.S. Reinforcement Learning (RL) overview [34] J.S. Recurrent Neural Network overview [35] J.S. RNN book preface [36] J.S. Co-evolving neurons [37] J.S. Metalearning overview [38] J.S. Deep Learning since 1991 .