My current postdocs are Jieyu Zhao, Martin Eldracher (temporarily), and Nic Schraudolph. My current PhD students are Sepp Hochreiter, Marco Wiering, and Rafal Salustowicz. So far, my most important contributions are (see online publications for more details):
(1)
Reinforcement learning with self-modifying policies (SMPs).
An SMP's probabilistic learning algorithm
is part of the SMP itself - SMPs can modify the way
they modify themselves.
The "success-story algorithm" (SSA)
forces stochastic SMPs to trigger better and better self-modifications.
SMP/SSA-based learners can solve complex tasks in partially observable
environments (POEs). With Marco Wiering and Jieyu Zhao.
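The success-story criterion can be sketched as a stack-based backtracking test (a hedged illustration only; the real SSA also restores the policy parameters changed by each popped self-modification, and the function name is my own):

```python
def ssa_backtrack(stack, reward, now):
    """Success-story test at a checkpoint: pop the most recent
    self-modification whenever the average reward per time step since it
    occurred does not exceed the average since its surviving predecessor.

    stack entries: (time_of_modification, cumulative_reward_at_that_time)
    """
    while len(stack) >= 2:
        t1, r1 = stack[-1]        # most recent self-modification
        t0, r0 = stack[-2]        # its surviving predecessor
        if (reward - r1) / (now - t1) > (reward - r0) / (now - t0):
            break                 # the success story still holds
        stack.pop()               # undo the latest self-modification
    return stack

# A modification made at t=10 (cumulative reward 5) followed by little
# further reward gets undone at the t=20 checkpoint:
print(ssa_backtrack([(0, 0), (10, 5)], reward=6, now=20))   # [(0, 0)]
# One followed by accelerating reward survives:
print(ssa_backtrack([(0, 0), (10, 1)], reward=10, now=20))  # [(0, 0), (10, 1)]
```

Only self-modifications followed by a higher reward rate than everything that came before remain on the stack, so surviving modifications form a "success story".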
(2)
``Predictability minimization'' (PM),
a co-evolutionary, unsupervised learning algorithm
based on neural feature detectors
and predictors that fight each other (1992 - ).
PM has various potential advantages over other
``neural'' methods for redundancy reduction.
When applied to image data, PM
automatically comes up with
feature detectors reminiscent of those in biological systems (such as
orientation sensitive edge detectors, on-center-off-surround detectors,
bar detectors). With Martin Eldracher, Daniel Prelinger, Bernhard Foltin, Nic
Schraudolph.
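The adversarial core of PM can be sketched in a few lines (a toy illustration with linear predictors and sigmoid code units; the learning rates, sizes, and architecture are my own assumptions, not the published setup):

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

# Toy data with built-in redundancy between input dimensions.
x = rng.normal(size=(200, 4))
x[:, 3] = x[:, 0] + 0.1 * rng.normal(size=200)

W = rng.normal(scale=0.1, size=(4, 2))   # feature detectors -> 2 code units
V = np.zeros((2, 2))                     # predictors: each code from the others
lr = 0.05

for _ in range(300):
    y = sigmoid(x @ W)                   # code units
    p = y @ V                            # each code predicted from the others
    err = p - y
    # Predictors do gradient DESCENT on the squared prediction error ...
    V -= lr * (y.T @ err) / len(x)
    np.fill_diagonal(V, 0.0)             # a unit may not predict itself
    # ... while feature detectors do gradient ASCENT on the same error:
    # they try to become unpredictable, i.e. carry independent information.
    g_y = err @ V.T - err                # d(err^2)/dy up to a factor of 2
    W += lr * (x.T @ (g_y * y * (1 - y))) / len(x)

print(y.shape)  # (200, 2)
```

The zero-diagonal constraint on V is what makes it a co-evolutionary game: predictors see only the *other* code units, so a code unit escapes prediction exactly when it encodes information the others do not.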
(3)
Various novel learning algorithms
for improving the performance of
recurrent neural nets with time-varying inputs - with focus on the
long time lag problem (1990 - ):
(3.1) History compression (1991 - 1994). Exploits regularities in partly predictable symbol strings to accelerate sequence classification with recurrent nets. An implementation based on hierarchical recurrent nets easily outperforms previous approaches when it comes to learning regular grammars from symbol sequences with long time lags between occurrences of relevant symbols (but there have to be local regularities). Conceptually related to predictability minimization (see above).
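The underlying idea — pass only the unpredicted symbols, plus their time stamps, up to the next level — can be sketched with a trivial stand-in predictor (in the actual method the predictor is itself a recurrent net):

```python
def compress_history(seq, predict):
    """Keep only the symbols the lower-level predictor gets wrong, together
    with their positions; predictable symbols carry no new information and
    are dropped, shortening the sequence the higher level must handle."""
    kept, prev = [], None
    for t, s in enumerate(seq):
        if predict(prev) != s:
            kept.append((t, s))
        prev = s
    return kept

# Assumed toy predictor: "the next symbol repeats the previous one".
repeat = lambda prev: prev

print(compress_history("aaaabbbaacc", repeat))
# -> [(0, 'a'), (4, 'b'), (7, 'a'), (9, 'c')]
```

An 11-symbol string collapses to 4 events, so time lags measured in raw symbols shrink accordingly at the higher level — which is why local regularities are required.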
(3.2) Fast gradient descent with fixed storage (1991). The standard algorithm for fully recurrent continually running networks (RTRL) requires O(n^4) computations per time step, where n is the number of non-input units. The new method (which duplicates work by Ron Williams) computes exactly the same gradient and requires fixed-size storage of the same order, but reduces the average time complexity per time step to O(n^3).
(3.3) Long Short-Term Memory (with Sepp Hochreiter, 1995). A novel recurrent net algorithm that avoids many drawbacks of previous approaches: LSTM can learn to bridge very long time lags by enforcing constant error flow back through time. In experimental comparisons with competing approaches, LSTM achieves many more successful runs and learns much faster.
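A minimal sketch of the cell update (in the original formulation with input and output gates; weight shapes and names here are mine): the cell state is updated additively, which is what keeps back-propagated error from vanishing or exploding across the lag.

```python
import numpy as np

sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h, c, Wi, Wo, Wc):
    """One step of a simplified LSTM cell."""
    z = np.concatenate([x, h])       # current input + previous hidden state
    i = sigmoid(Wi @ z)              # input gate: what may enter the cell
    o = sigmoid(Wo @ z)              # output gate: what the cell reveals
    c = c + i * np.tanh(Wc @ z)      # ADDITIVE update: the "constant error
    h = o * np.tanh(c)               #  carousel" that preserves gradients
    return h, c

rng = np.random.default_rng(0)
n_in, n_cell = 3, 2
Wi, Wo, Wc = (rng.normal(scale=0.1, size=(n_cell, n_in + n_cell)) for _ in range(3))
h, c = np.zeros(n_cell), np.zeros(n_cell)
for _ in range(100):                 # run across a long input sequence
    h, c = lstm_step(rng.normal(size=n_in), h, c, Wi, Wo, Wc)
print(h.shape, c.shape)              # (2,) (2,)
```

Because c enters its own update with coefficient 1, the derivative of c at time t with respect to c at time t-k stays close to 1, unlike the repeatedly squashed recurrences of conventional recurrent nets.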
(4)
Minimum description length-based
methods for finding ``simple''
neural nets with low
algorithmic complexity and high
generalization capability (1994 - ).
(4.1) Generalization and Kolmogorov complexity. A variant of Levin's universal search algorithm can be used to discover neural nets with low Levin complexity, low Kolmogorov complexity, and high generalization capability. At least on certain toy problems where it is computationally feasible, the method can lead to generalization results unmatched by traditional neural net algorithms.
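Levin's universal search allocates runtime to each program p in proportion to 2^(-l(p)), so short (low-complexity) solutions are found first. A toy sketch over bitstring programs (the "interpreter" and target task are stand-ins I made up for illustration):

```python
def levin_search(run, solved, max_phase=12):
    """In phase k, run every bitstring program of length l <= k for
    2**(k - l) steps: total work per phase is O(2**k), and shorter
    programs always receive the larger time budgets."""
    for phase in range(1, max_phase + 1):
        for length in range(1, phase + 1):
            budget = 2 ** (phase - length)
            for bits in range(2 ** length):
                prog = format(bits, f"0{length}b")
                out = run(prog, budget)          # None if budget ran out
                if out is not None and solved(out):
                    return prog
    return None

# Stand-in interpreter: a program "prints itself", one bit per step.
run = lambda prog, budget: prog if budget >= len(prog) else None
print(levin_search(run, solved=lambda out: out == "1011"))  # 1011
```

Applied to neural nets, the searched programs would instead encode weight matrices, and `solved` would test training-set performance; the bias toward short programs is what yields low-complexity, well-generalizing nets.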
(4.2) Flat minimum search (with Sepp Hochreiter). A ``minimum description length''-based argument shows that flat minima of typical error functions correspond to low expected overfitting and high generalization. In applications to stock market prediction, flat minimum search outperforms widely used competitors.
(5) Other contributions include:
(5.1) Methods translating mismatches between reality and (confidence-based) expectations into reinforcement for ``curious'', exploring agents (1990 - 1995). Conceptually related to predictability minimization (see above).
(5.2) Methods for reinforcement learning in partially observable environments (1990 - ).
(5.3) The Neural Bucket Brigade algorithm (1989).
(5.4) The ``Neural Heat Exchanger'' (1990) - a supervised variant of Hinton and Dayan's 1994 ``Helmholtz machine''.
(5.5) Adaptive neural subgoal generators. With Rainer Wahnsiedler.
(5.6) Networks using fast weights to handle time-varying inputs.
(5.7) A hybrid data compression method combining predictive neural nets and traditional statistical coding techniques. It can sometimes outperform the widely used Lempel-Ziv text compression algorithm. With Stefan Heil.
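The principle — a predictor supplies P(next symbol | context) and an entropy coder charges -log2 P bits per symbol — can be sketched with a count-based context model standing in for the neural predictor (the order and smoothing constant below are my own choices):

```python
import math
from collections import defaultdict

def predictive_code_length(text, order=2):
    """Bits an ideal arithmetic coder would need when driven by an
    adaptive order-k context model: each symbol costs
    -log2 P(symbol | preceding k symbols)."""
    counts = defaultdict(lambda: defaultdict(int))
    totals = defaultdict(int)
    bits = 0.0
    for t, ch in enumerate(text):
        ctx = text[max(0, t - order):t]
        p = (counts[ctx][ch] + 1) / (totals[ctx] + 256)  # Laplace smoothing
        bits += -math.log2(p)
        counts[ctx][ch] += 1          # the model adapts as it codes
        totals[ctx] += 1
    return bits

sample = "the better the predictor, the fewer bits per symbol. " * 10
print(predictive_code_length(sample) / len(sample))  # below 8 bits/char
```

The better the predictor, the fewer bits per symbol — which is why replacing the count table with a trained neural net can beat dictionary-based schemes like Lempel-Ziv on suitable data.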
(5.8) A neural system for active vision. It learns to sequentially shift the focus of attention, to detect targets in visual scenes. With Rudolf Huber.
(5.9) Genetic Programming (1987, with Dickmanns and Winklhofer). To our knowledge, we wrote the first paper on using genetic algorithms to evolve assembler-like computer programs. We applied our system to simple tasks, including the ``lawnmower problem'' (later also studied by Koza, 1994).
(5.10) Other things. Go to online publications for more details.
(6) I am also active as an artist. In 1994
I introduced the concept of ``Low-Complexity Art''.
Low-complexity art is the computer age equivalent of
minimal art. It is art with low Kolmogorov complexity -
art that can be generated by a short algorithm.
See the Low-Complexity Art page and online publications for more.
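To illustrate the definition (a toy example of my own, not one of the published artworks): a few lines of code suffice to generate a structured image, so the image has low Kolmogorov complexity.

```python
# The entire "artwork" is produced by this short program: ring-like
# bands generated by a one-line arithmetic rule.
size = 24
image = [
    "".join("#" if (x * x + y * y) % 9 < 4 else "." for x in range(size))
    for y in range(size)
]
print("\n".join(image))
```

The length of the generating program, not the pixel count, bounds the complexity of the result.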