Jürgen Schmidhuber's
TU Munich Cogbotlab
Course (1 semester)

MACHINE LEARNING
& OPTIMIZATION I

WS 2004/2005: We are done! What you should know for the oral exams on MLBIO I on Feb 16 2005 (3 candidates so far) or Mar 17 (2 candidates so far): Bayes, ML, MAP, HMMs, Viterbi, EM, nonlinear nets, backprop, max margin, SVMs, recurrence: BPTT / RTRL / LSTM, differentiable world models, Q-learning, TD, POMDPs, hill-climbing, evolution, universal search, artificial ants, info theory basics, unsupervised learning, factorial codes, SOMs

TUM link

General Overview. We focus on learning agents interacting with an initially unknown world. Since the world is dynamic, our course, unlike many other machine learning courses, puts strong emphasis on learning to deal with sequential data: we do not just want to learn reactive input / output mappings but programs (running, e.g., on recurrent neural nets) that perceive, classify, plan, make decisions, etc.

We minimize overlaps with other TUM courses related to machine learning and bio-inspired optimization. The lectures cover one semester; the course will eventually become a "Wahlpflichtfach" (compulsory elective).

SS 2005: The follow-up course ML & O II will be a "Vertiefungsfach" (specialization course). There will also be a related "Praktikum" (lab course) and "Hauptseminar" (advanced seminar).

Course material. We often use the blackboard and PowerPoint presentations. Below you will find links to supporting material.

Don't worry; you won't have to learn all of this! During the lectures we will explain what's really relevant for the oral exams at the end of the semester. But of course students are encouraged to read more than that!

Thanks to Andy Moore, Luca Gambardella, Marcus Hutter, Andy Ng for some of the material below.

Feedforward Neural Networks. Early NN research focused on learning through gradient descent in feedforward NNs. We will only briefly discuss the essential concepts and limitations, also to avoid overlap with other TUM courses such as the Machine Learning and AI Praktikum, and then focus on world model builders, information theory, and unsupervised learning. A minimal backprop sketch follows the reading list below.
1. Chapters 1, 3, 6, 7 of the Beilage (course handout)
2. Moore's neural network slides
3. Papers on unsupervised learning
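
To make the gradient-descent idea concrete, here is a minimal backprop sketch in Python. It is not part of the course material; the XOR task, network size, and learning rate are illustrative assumptions.

    # One-hidden-layer feedforward net trained by plain gradient
    # descent (backprop) on XOR. Illustrative sketch only.
    import numpy as np

    rng = np.random.default_rng(0)
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)  # toy inputs
    y = np.array([[0], [1], [1], [0]], dtype=float)              # XOR targets

    W1 = rng.normal(scale=0.5, size=(2, 4)); b1 = np.zeros(4)
    W2 = rng.normal(scale=0.5, size=(4, 1)); b2 = np.zeros(1)

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    lr = 1.0
    for step in range(5000):
        h = sigmoid(X @ W1 + b1)             # forward pass
        out = sigmoid(h @ W2 + b2)
        d_out = (out - y) * out * (1 - out)  # backward pass (squared error)
        d_h = (d_out @ W2.T) * h * (1 - h)
        W2 -= lr * h.T @ d_out; b2 -= lr * d_out.sum(0)
        W1 -= lr * X.T @ d_h;   b1 -= lr * d_h.sum(0)

    print(out.round(2))  # outputs should approach [0, 1, 1, 0]
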
Support Vector Machines. SVMs have largely replaced feedforward NNs in non-sequential classification tasks. We will briefly discuss the essential algorithms, minimizing overlap with other TUM courses such as Machine Learning in Bioinformatics. A small max-margin example appears after the reading list.
1. Moore's SVM slides
2. Ng's SVM notes
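
As a quick illustration of max-margin classification, here is a sketch assuming the scikit-learn library is available; the synthetic blob data and all parameters are arbitrary demo choices, not course material.

    # Kernelized soft-margin SVM on synthetic 2-class data.
    from sklearn import datasets, svm
    from sklearn.model_selection import train_test_split

    X, y = datasets.make_blobs(n_samples=200, centers=2, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    clf = svm.SVC(kernel="rbf", C=1.0)  # C trades margin width vs. training errors
    clf.fit(X_tr, y_tr)
    print("support vectors:", len(clf.support_vectors_))
    print("test accuracy:", clf.score(X_te, y_te))
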
Recurrent Neural Networks. RNNs can implement complex algorithms, as opposed to the reactive input / output mappings of feedforward nets and SVMs. We discuss gradient-based and evolutionary learning algorithms for RNNs; a BPTT sketch appears after the reading list.
1. Chapter 2 of Beilage
2. Recurrent networks tutorial, with a focus on LSTM (PDF).
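
The sketch below unrolls a vanilla RNN over time and trains it with backprop through time (BPTT) on a toy memory task: output the first bit of a random bit string. The task, sizes, and learning rate are illustrative assumptions, not course code.

    # Vanilla RNN with tanh units; gradients computed by BPTT.
    import numpy as np

    rng = np.random.default_rng(1)
    H, T, lr = 8, 5, 0.1
    Wxh = rng.normal(scale=0.5, size=(1, H))   # input-to-hidden weights
    Whh = rng.normal(scale=0.5, size=(H, H))   # recurrent weights
    Why = rng.normal(scale=0.5, size=(H, 1))   # hidden-to-output weights

    for step in range(3000):
        xs = rng.integers(0, 2, size=T).astype(float)  # random bit string
        target = xs[0]                                 # recall the first bit
        hs = [np.zeros(H)]
        for t in range(T):                             # forward, unrolled in time
            hs.append(np.tanh(xs[t] * Wxh[0] + hs[-1] @ Whh))
        out = 1 / (1 + np.exp(-(hs[-1] @ Why[:, 0]))) # sigmoid readout
        d_out = out - target                           # cross-entropy gradient
        dWhy = np.outer(hs[-1], d_out)
        d_h = d_out * Why[:, 0]
        dWxh, dWhh = np.zeros_like(Wxh), np.zeros_like(Whh)
        for t in reversed(range(T)):                   # backward through time
            d_raw = d_h * (1 - hs[t + 1] ** 2)         # through the tanh
            dWxh[0] += xs[t] * d_raw
            dWhh += np.outer(hs[t], d_raw)
            d_h = Whh @ d_raw
        Wxh -= lr * dWxh; Whh -= lr * dWhh; Why -= lr * dWhy

    print("final |error|:", abs(d_out))  # should be near 0 after training

For long time lags, plain BPTT suffers from vanishing gradients, which is what LSTM (see the tutorial above) addresses.
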
Evolutionary Computation. We discuss bio-inspired methods for evolving and optimizing solutions to problems defined by fitness functions, such as evolution strategies, genetic algorithms, and adaptive grids, with applications to recurrent networks and program search. A minimal example follows the reading list.
1. Basic evolutionary algorithms: Chapter 8 of Beilage
2. Program Evolution and Genetic Programming
3. Adaptive Grids
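
As a minimal example of the evolutionary idea, here is a (1+1) evolution strategy hill-climbing on a simple fitness function; the sphere function and step size are illustrative assumptions.

    # (1+1) evolution strategy: keep one parent, mutate it with Gaussian
    # noise, accept the child if its fitness is at least as good.
    import numpy as np

    rng = np.random.default_rng(2)

    def fitness(x):
        return -np.sum(x ** 2)       # maximize; optimum at the origin

    parent = rng.normal(size=10)
    sigma = 0.5                      # mutation step size
    for gen in range(2000):
        child = parent + sigma * rng.normal(size=10)
        if fitness(child) >= fitness(parent):
            parent = child           # selection: keep the fitter individual
    print("best fitness:", fitness(parent))

A genetic algorithm differs mainly in keeping a whole population and recombining (crossing over) pairs of parents instead of mutating a single one.
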
Probabilities, HMMs, EM. Introduction / review: essential concepts of probability theory and statistics. Maximum Likelihood and MAP estimators. Hidden Markov Models, the Viterbi algorithm, Expectation Maximization. A Viterbi sketch follows the reading list.
1. Moore's elementary Bayes intro and HMM intro
2. Overview: statistical robotics, with an outlook on the theoretically optimal universal approach.
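
For concreteness, here is a small Viterbi implementation; the toy weather HMM is a textbook-style assumption, not taken from the course material.

    # Viterbi: most likely hidden state sequence given observations.
    import numpy as np

    states = ["Rainy", "Sunny"]
    pi = np.array([0.6, 0.4])              # initial state probabilities
    A = np.array([[0.7, 0.3],              # state transition matrix
                  [0.4, 0.6]])
    B = np.array([[0.1, 0.4, 0.5],         # emissions: walk, shop, clean
                  [0.6, 0.3, 0.1]])
    obs = [0, 1, 2]                        # observed: walk, shop, clean

    T, N = len(obs), len(states)
    delta = np.zeros((T, N))               # best path probability so far
    psi = np.zeros((T, N), dtype=int)      # backpointers
    delta[0] = pi * B[:, obs[0]]
    for t in range(1, T):
        for j in range(N):
            scores = delta[t - 1] * A[:, j]
            psi[t, j] = scores.argmax()
            delta[t, j] = scores.max() * B[j, obs[t]]

    path = [int(delta[-1].argmax())]       # backtrack the best path
    for t in range(T - 1, 0, -1):
        path.append(psi[t, path[-1]])
    print([states[s] for s in reversed(path)])
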
Traditional Reinforcement Learning. RL is about learning to maximize future reward. Most traditional RL research focuses on problems solvable by agents with reactive policies that need no memory of previous observations. We discuss the most popular RL methods, their relation to dynamic programming, and their limitations. A tabular Q-learning sketch appears after the reading list.
1. Chapter 5 of Beilage
2. Moore's RL slides
3. Papers on RL
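
To illustrate the TD idea, here is a tabular Q-learning sketch; the 5-state chain world, rewards, and parameters are illustrative assumptions.

    # Tabular Q-learning on a deterministic chain; the rightmost
    # state is the goal and yields reward 1.
    import numpy as np

    rng = np.random.default_rng(3)
    N_STATES, ACTIONS = 5, [-1, +1]      # actions: move left / move right
    alpha, gamma, eps = 0.5, 0.9, 0.3
    Q = np.zeros((N_STATES, len(ACTIONS)))

    for episode in range(500):
        s = 0
        while s != N_STATES - 1:
            a = (rng.integers(len(ACTIONS)) if rng.random() < eps
                 else int(Q[s].argmax()))          # epsilon-greedy policy
            s_next = min(max(s + ACTIONS[a], 0), N_STATES - 1)
            r = 1.0 if s_next == N_STATES - 1 else 0.0
            # TD update toward r + gamma * max_a' Q(s', a')
            Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
            s = s_next

    print(Q.round(2))  # greedy policy should prefer moving right

Note that Q-learning needs no world model: it updates value estimates directly from sampled transitions.
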
Artificial Ants. AAs are multiagent optimizers that use local search techniques and communicate via artificial pheromones that evaporate over time. They achieve state-of-the-art performance in numerous optimization tasks. A minimal sketch follows the links below.

1. Gambardella's AA slides
2. Ant links
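
As an illustration, here is a minimal Ant System sketch for a tiny travelling-salesman instance; the random cities and all parameters are assumptions for demonstration only.

    # Ant System: ants build tours guided by pheromone and distance;
    # pheromone evaporates each iteration and good tours deposit more.
    import numpy as np

    rng = np.random.default_rng(4)
    cities = rng.random((8, 2))                  # 8 random city coordinates
    d = np.linalg.norm(cities[:, None] - cities[None, :], axis=2)
    np.fill_diagonal(d, np.inf)                  # no self-loops
    tau = np.ones_like(d)                        # pheromone levels
    alpha, beta, rho, n_ants = 1.0, 2.0, 0.5, 10

    best_len, best_tour = np.inf, None
    for it in range(100):
        tours = []
        for _ in range(n_ants):
            tour = [0]
            while len(tour) < len(cities):
                i = tour[-1]
                w = (tau[i] ** alpha) * (d[i] ** -beta)  # edge desirability
                w[tour] = 0.0                            # never revisit a city
                tour.append(int(rng.choice(len(cities), p=w / w.sum())))
            tours.append(tour)
        tau *= 1 - rho                           # pheromone evaporation
        for tour in tours:
            L = sum(d[tour[k], tour[(k + 1) % len(tour)]]
                    for k in range(len(tour)))
            if L < best_len:
                best_len, best_tour = L, tour
            for k in range(len(tour)):           # deposit, weighted by tour quality
                tau[tour[k], tour[(k + 1) % len(tour)]] += 1.0 / L

    print("best tour length:", round(best_len, 3))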