Scroll down for papers & videos. Below: our RNN-controlled surgery bot ties a knot
Jürgen Schmidhuber's page on

LEARNING ROBOTS

in partially observable environments.
Check out our current robots!

Last update 2013
Cognitive Robotics at Schmidhuber's former TU Munich Cogbotlab

Our collaborators also include numerous robot labs at TUM, DLR, UniBW, and ITM. IDSIA has also participated in many EU robot projects, such as the SWARMBOT project - compare the EU exystence ad (pdf). With CSEM we are working on attentive sensing and robot learning with hierarchical control strategies. New IDSIA projects on developmental robotics with adaptive humanoids and on artificial hands with elastic muscles started in 2009.

Some hardwired, pre-programmed robots, such as UniBW Munich's fast robot car (1994) and TU Munich's humanoid walking biped, perform impressive tasks. But they do not learn like humans do.

So how can we make them learn from experience? Unfortunately, traditional reinforcement learning algorithms are limited to simple reactive behavior and do not work well for realistic robots.

Robot learning in realistic environments requires novel algorithms for learning to identify important events in the stream of sensory inputs, and to temporarily memorize them in adaptive, dynamic, internal states until the memories can help to compute proper control actions.
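The core requirement - detect an important event, hold it in internal state, and act on it much later - can be illustrated with a minimal sketch. This is our own toy example in plain Python, hand-wired rather than trained, and far simpler than the LSTM controllers discussed here; it only shows why an internal memory state is needed when the decisive observation lies far in the past.

```python
def run_episode(cue, T=20):
    """Toy recurrent controller: one self-recurrent unit latches a
    binary cue seen only at t=0 and holds it in its internal state
    until the final step, when the action must depend on that
    long-gone observation (a purely reactive policy cannot do this)."""
    h = 0.0                            # internal (memory) state
    for t in range(T):
        obs = cue if t == 0 else 0.0   # the cue is visible only once
        h = 1.0 * h + obs              # self-recurrent weight 1.0 preserves it
    # the final action depends on the memorized event, not the current obs
    return 1 if h > 0.5 else 0

print(run_episode(1))  # -> 1: the cue is recalled after the delay
print(run_episode(0))  # -> 0
```

A trained recurrent network must discover such a latching solution on its own; compare publication 6 below on reinforcement learning with LSTM.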

We believe that among the most promising approaches for learning to memorize are our recurrent neural networks, policy gradients, and the Optimal Ordered Problem Solver. An ambitious long-term goal is to implement a full-fledged Gödel machine for a real learning robot. Research topics of our CoTeSys group included: artificial curiosity and creativity for the DLR artificial hands and the iCub baby robot; behavior evolution for AM's 180cm walking biped; visual attention, unsupervised learning, and sequence learning for adaptive mobile robots and robot cars; and safety for humans interacting with learning robots.
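Of the approaches just listed, policy gradients are the simplest to sketch. Below is a minimal REINFORCE example on a two-armed bandit - a toy stand-in of our own, not the parameter-exploring policy gradients of publication 19 below; all names and hyperparameters are ours.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two-armed bandit: arm 1 pays 1.0 on average, arm 0 only 0.2.
def pull(arm):
    return rng.normal(1.0 if arm == 1 else 0.2, 0.1)

theta = np.zeros(2)   # policy parameters: one logit per arm
alpha = 0.1           # learning rate

for episode in range(2000):
    p = np.exp(theta) / np.exp(theta).sum()   # softmax policy
    arm = rng.choice(2, p=p)
    r = pull(arm)
    # REINFORCE update: grad log pi(arm) = onehot(arm) - p
    grad = -p
    grad[arm] += 1.0
    theta += alpha * r * grad

p = np.exp(theta) / np.exp(theta).sum()
print("preference for the better arm:", round(p[1], 3))
```

The policy ends up strongly preferring the better arm; for real robots the same gradient estimator is applied to the parameters of a neural controller rather than two logits.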

Check out IDSIA's robots in the EXPO21xx showroom - see also the video on humanoid research with the iCub baby robot by the IDSIA Robotics Lab. Our current robots are here.

We are studying not only real robots but also virtual ones, living in 3-dimensional, video game-like worlds with rather realistic simulated physics. We are also interested in non-wheeled learning robots, such as artificial snakes. Their navigation problems are harder than those of wheeled robots. On the other hand, they can deal with rough terrain.

Our PyBrain machine learning library features source code for many robot learning algorithms. See the PyBrain video and IDSIA's iCub video with applications to adaptive robotics.

Video on humanoid research with iCub baby robot in Juergen Schmidhuber's lab
RNN-Evolution
Evolution
Reinforcement Learning
SSA learns a complex task involving two agents and two keys
Statistical Robotics
Goedel machine
Master in Intelligent Systems at the University of Lugano, Switzerland, in collaboration with IDSIA
Right: Our collaborator Alexander Gloye-Foerster (IDSIA) led the FU-Fighters team that became RoboCup world champion in 2004 in the fastest league (robot speeds up to 5 m/s).
The RoboCup robots plan ahead with neural nets, implementing ideas first outlined in J. Schmidhuber: An on-line algorithm for dynamic reinforcement learning and planning in reactive environments. In Proc. IEEE/INNS IJCNN, San Diego, vol. 2, pp. 253-258, 1990. PS.GZ. Compare the long report (1990, PDF).
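The planning idea - learn a model of the world from experience, then run "mental" experiments in the model before acting in reality (compare publication 13 below) - can be sketched as follows. This is a deliberately tiny stand-in of our own: a tabular model and exhaustive rollout search replace the neural nets of the actual work.

```python
import random
from itertools import product

random.seed(0)

# Toy deterministic world: states 0..4 in a row, reward for reaching state 4.
def env_step(s, a):                     # a is -1 (left) or +1 (right)
    s2 = max(0, min(4, s + a))
    return s2, (1.0 if s2 == 4 else 0.0)

# 1) Build a world model from random interaction with the real environment.
model = {}                              # (state, action) -> (next_state, reward)
for _ in range(200):
    s, a = random.randint(0, 4), random.choice([-1, 1])
    model[(s, a)] = env_step(s, a)

# 2) 'Mental' experiments: roll the learned model forward, never the real env.
def simulate(s, actions):
    total = 0.0
    for a in actions:
        s, r = model[(s, a)]
        total += r
    return total

def plan(s, horizon=6):
    # Choose the first action of the best imagined action sequence.
    best = max(product([-1, 1], repeat=horizon),
               key=lambda seq: simulate(s, seq))
    return best[0]

print(plan(0))   # the planner chooses +1: move toward the reward
```

With a learned neural world model the exhaustive search is replaced by gradient-based or sampled lookahead, but the principle is the same: cheap simulated trials select the action executed in the real world.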
More on RoboCup on the page of Alexander, creator of the first self-healing robots with self-models.
Above: the iCub baby robot as used in JS's EU project IM-CLEVER on developmental robotics and on implementing the theory of creativity & curiosity & novelty & interestingness on robots.

Related links:
1. All Publications
2. Recurrent nets
3. RNN evolution
4. Evolution main page
5. Universal learners
6. Gödel machines
7. OOPS
8. Statistical robotics
9. Reinforcement learning
10. Metalearning
11. Active learning
12. Hierarchical learning
13. Attentive vision
14. Artificial Intelligence
15. Robot population explosion
16. Robots with self-model
17. CoTeSys group
18. Compressed Network Search

German home

Fibonacci web design by J. Schmidhuber

Selected Publications on Robot Learning / Soccer Learning

34. M. Stollenga, L. Pape, M. Frank, J. Leitner, A. Förster, J. Schmidhuber. Task-Relevant Roadmaps: A Framework for Humanoid Motion Planning. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Tokyo, Japan, 2013. PDF.

33. H. Ngo, M. Luciw, A. Förster, J. Schmidhuber. Confidence-based progress-driven self-generated goals for skill acquisition in developmental robots. Frontiers in Psychology, 2013. doi: 10.3389/fpsyg.2013.00833

32. J. Leitner, S. Harding, M. Frank, A. Förster, J. Schmidhuber. Artificial Neural Networks For Spatial Perception: Towards Visual Object Localisation in Humanoid Robots. International Joint Conference on Neural Networks (IJCNN), Dallas, USA, 2013. PDF.

31. J. Leitner, S. Harding, M. Frank, A. Förster, J. Schmidhuber. Humanoid Learns to Detect Its Own Hands. IEEE Congress on Evolutionary Computing (CEC), Cancun, Mexico, 2013. PDF.

30. J. Koutnik, G. Cuccu, J. Schmidhuber, F. Gomez. Evolving Large-Scale Neural Networks for Vision-Based Reinforcement Learning. In Proceedings of the Genetic and Evolutionary Computation Conference (GECCO), Amsterdam, 2013. PDF.

29. M. Frank, J. Leitner, M. Stollenga, G. Kaufmann, S. Harding, A. Förster, J. Schmidhuber. The Modular Behavioral Environment for Humanoids & other Robots (MoBeE). 9th International Conference on Informatics in Control, Automation and Robotics (ICINCO). Rome, Italy, 2012. PDF.

28. V. R. Kompella, M. Luciw, M. Stollenga, L. Pape, J. Schmidhuber. Autonomous Learning of Abstractions using Curiosity-Driven Modular Incremental Slow Feature Analysis. Proc. IEEE Conference on Development and Learning / EpiRob 2012 (ICDL-EpiRob'12), San Diego, 2012.

27. J. Leitner, S. Harding, M. Frank, A. Foerster, J. Schmidhuber. Transferring Spatial Perception Between Robots Operating In A Shared Workspace. Proc. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS'12), Vilamoura, 2012.

26. J. Leitner, P. Chandrashekhariah, S. Harding, M. Frank, G. Spina, A. Foerster, J. Triesch, J. Schmidhuber. Autonomous Learning Of Robust Visual Object Detection And Identification On A Humanoid. Proc. IEEE Conference on Development and Learning / EpiRob 2012 (ICDL-EpiRob'12), San Diego, 2012.

25. L. Pape, C. M. Oddo, M. Controzzi, C. Cipriani, A. Foerster, M. C. Carrozza, J. Schmidhuber. Learning tactile skills through curious exploration. Frontiers in Neurorobotics 6:6, 2012, doi: 10.3389/fnbot.2012.00006

24. H. Ngo, M. Luciw, A. Foerster, J. Schmidhuber. Learning Skills from Play: Artificial Curiosity on a Katana Robot Arm. Proc. IJCNN 2012.

23. V. R. Kompella, L. Pape, J. Masci, M. Frank and J. Schmidhuber. AutoIncSFA and Vision-based Developmental Learning for Humanoid Robots. 11th IEEE-RAS International Conference on Humanoid Robots (Humanoids), Bled, Slovenia, 2011.

22. M. Frank, A. Förster, J. Schmidhuber. Reflexive Collision Response with Virtual Skin. International Conference on Agents and Artificial Intelligence ICAART 2012.

21. T. Schaul, J. Bayer, D. Wierstra, S. Yi, M. Felder, F. Sehnke, T. Rückstiess, J. Schmidhuber. PyBrain. Journal of Machine Learning Research (JMLR), 11:743-746, 2010. PDF. (See Pybrain video.)

20. T. Rückstiess, F. Sehnke, T. Schaul, D. Wierstra, S. Yi, J. Schmidhuber. Exploring Parameter Space in Reinforcement Learning. Paladyn Journal of Behavioral Robotics, 2010. PDF.

19. F. Sehnke, C. Osendorfer, T. Rückstiess, A. Graves, J. Peters, J. Schmidhuber. Parameter-exploring policy gradients. Neural Networks 23(2), 2010. PDF.

18. H. Mayer, F. Gomez, D. Wierstra, I. Nagy, A. Knoll, and J. Schmidhuber. A System for Robotic Heart Surgery that Learns to Tie Knots Using Recurrent Neural Networks. Advanced Robotics, 22/13-14, p. 1521-1537, 2008.

17. J. Schmidhuber: Prototype resilient, self-modeling robots. Science 316, no. 5825, p. 688, May 2007.

16. F. Gomez, J. Schmidhuber, and R. Miikkulainen (2006). Efficient Non-Linear Control through Neuroevolution. Proceedings of the European Conference on Machine Learning (ECML-06, Berlin). PDF. A new, general method that outperforms many others on difficult control tasks.

15. H. Mayer, F. Gomez, D. Wierstra, I. Nagy, A. Knoll, and J. Schmidhuber (2006). A System for Robotic Heart Surgery that Learns to Tie Knots Using Recurrent Neural Networks. Proceedings of the International Conference on Intelligent Robotics and Systems (IROS-06, Beijing). PDF. (Nominated for best paper award.)

14. J.  Schmidhuber. Developmental Robotics, Optimal Artificial Curiosity, Creativity, Music, and the Fine Arts. Connection Science, 18(2): 173-187, June 2006. PDF.

13. B. Bakker, V. Zhumatiy, G. Gruener, J. Schmidhuber. Quasi-Online Reinforcement Learning for Robots. Proceedings of the International Conference on Robotics and Automation (ICRA-06), Orlando, Florida, 2006. PDF. A reinforcement learning vision-based robot that learns to build a simple model of the world and itself. To figure out how to achieve rewards in the real world, it performs numerous `mental' experiments using the adaptive world model.

12. V. Zhumatiy, F. Gomez, M. Hutter, and J. Schmidhuber. Metric State Space Reinforcement Learning for a Vision-Capable Mobile Robot. In Proceedings of the International Conference on Intelligent Autonomous Systems, IAS-06, Tokyo, 2006. PDF.

11. F. J. Gomez and J. Schmidhuber. Evolving modular fast-weight networks for control. In W. Duch et al. (Eds.): Proc. Intl. Conf. on Artificial Neural Networks ICANN'05, LNCS 3697, pp. 383-389, Springer-Verlag Berlin Heidelberg, 2005. Featuring a 3-wheeled reinforcement learning robot with holonomic drive and distance sensors. Without a teacher it learns to balance a jointed pole indefinitely in a simulated 3D environment confined by walls. PDF. HTML overview.

10. Schmidhuber, J., Zhumatiy, V. and Gagliolo, M. Bias-Optimal Incremental Learning of Control Sequences for Virtual Robots. In Groen, F., Amato, N., Bonarini, A., Yoshida, E., and Kröse, B., editors: Proceedings of the 8-th conference on Intelligent Autonomous Systems, IAS-8, Amsterdam, The Netherlands, pp. 658-665, 2004. PDF .

9. B. Bakker and J. Schmidhuber. Hierarchical Reinforcement Learning Based on Subgoal Discovery and Subpolicy Specialization (PDF). In F. Groen, N. Amato, A. Bonarini, E. Yoshida, and B. Kröse (Eds.), Proceedings of the 8-th Conference on Intelligent Autonomous Systems, IAS-8, Amsterdam, The Netherlands, p. 438-445, 2004.

8. B. Bakker, V. Zhumatiy, G. Gruener, and J. Schmidhuber. A Robot that Reinforcement-Learns to Identify and Memorize Important Previous Observations (PDF). In Proceedings of the 2003 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2003.

7. B. Bakker, F. Linaker, J. Schmidhuber. Reinforcement Learning in Partially Observable Mobile Robot Domains Using Unsupervised Event Extraction. In Proceedings of the 2002 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2002), Lausanne, 2002. PDF .

6. B. Bakker. Reinforcement Learning with Long Short-Term Memory. Advances in Neural Information Processing Systems 14 (NIPS 2001), 2002. (On J. Schmidhuber's CSEM grant 2002.)

5. M. Wiering, R. Salustowicz, J. Schmidhuber. Model-based reinforcement learning for evolving soccer strategies. In Computational Intelligence in Games, chapter 5. Editors N. Baba and L. Jain. pp. 99-131, 2001. PDF .

4. M. Wiering, R. Salustowicz, J. Schmidhuber. Reinforcement learning soccer teams with incomplete world models. Journal of Autonomous Robots, 7(1):77-88, 1999. PDF .

3. R. Salustowicz, M. Wiering, and J. Schmidhuber. Learning team strategies: soccer case studies. Machine Learning 33(2/3), 263-282, 1998. PDF.

2. R. Salustowicz and M. Wiering and J. Schmidhuber. Evolving soccer strategies. In N. Kasabov, R. Kozma, K. Ko, R. O'Shea, G. Coghill, and T. Gedeon, editors, Progress in Connectionist-based Information Systems: Proceedings of the Fourth International Conference on Neural Information Processing ICONIP'97, volume 1, pages 502-505, 1997.

1. R. Salustowicz and M. Wiering and J. Schmidhuber. On learning soccer strategies. In W. Gerstner, A. Germond, M. Hasler, J.-D. Nicoud, eds., Proceedings of the International Conference on Artificial Neural Networks, Lausanne, Switzerland, Springer, 769-774, 1997.

STIFF - EU research project on enhancing biomorphic agility of robot arms and hands through variable stiffness & elasticity
2009: 10 new jobs in Schmidhuber's Robot Learning Lab at IDSIA
Optimal Ordered Problem Solver
Feedback Network
Artificial Curiosity
Universal AI
robot population explosion
Best robot car so far (Dickmanns, 1995)
Resilient machine with Continuous Self-Modeling
Computer Vision with Fast Deep Neural Nets Etc Yield Best Results on Many Visual Pattern Recognition Benchmarks
2011: First Superhuman Visual Pattern Recognition
My first Deep Learner of 1991 + Deep Learning timeline 1962-2013

TU Munich's pioneering walking biped (Pfeiffer & Ulbrich et al., 1990s-2005): 180cm, stereo vision guidance by LSR. Our CoTeSys group worked on evolving its behavior.
Right: AAAI 2013 Best Student Video Award (Seattle, 2013) for the video on roadmap planning for an iCub humanoid robot, also on YouTube. This Shakey-winning video was made by Marijn Stollenga, Mikhail Frank, Juxi Leitner, Leo Pape, Alexander Foerster, and Jan Koutnik in the group of JS.
See also our other iCub baby robot video, and visit the IDSIA Robotics Lab