
An Experiment With Unknown Time Delays

In this experiment, the system was presented with a continuous stream of input events and $F$'s task was to switch on the single output unit the first time an event 'B' occurred following an event 'A'. At all other times, the output unit was to be switched off. This is the flip-flop task described in [Williams and Zipser, 1989].
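The target signal for this task can be sketched in a few lines. The following is only an illustration of the task definition above, not the code used in the experiment; the names `flip_flop_targets` and `random_stream` and the uniform choice among the three events are assumptions made for this sketch.

```python
import random

EVENTS = ("A", "B", "C")  # the three possible input events


def flip_flop_targets(events):
    """Return the desired output (1.0 or 0.0) for every event in the stream."""
    armed = False  # True once an 'A' has occurred and no 'B' has followed yet
    targets = []
    for event in events:
        if event == "B" and armed:
            targets.append(1.0)  # first 'B' after an 'A': output switched on
            armed = False
        else:
            targets.append(0.0)  # at all other times the output is off
            if event == "A":
                armed = True
    return targets


def random_stream(n, seed=0):
    """Hypothetical helper: n events drawn uniformly at random (an assumption)."""
    rng = random.Random(seed)
    return [rng.choice(EVENTS) for _ in range(n)]
```

For the stream C, A, B, B, C, A, A, C, B the function marks only the third and the last event as 'on', since the second 'B' is not preceded by a fresh 'A'.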

One difficulty with this task is that there can be arbitrary time lags between relevant events. An additional difficulty is that no information about 'episode boundaries' is given. The on-line method was employed: the activations of the networks were never reset. Thus, activations caused by events from past 'episodes' could have a harmful effect on activations and weights in later episodes.

Both $F$ and $S$ had the topology of standard feedforward perceptrons. $F$ had 3 input units for the 3 possible events 'A', 'B', and 'C'. Events were represented in a local manner: at a given time, a randomly chosen input unit was activated with a value of 1.0, and the others were deactivated. $F$'s output was one-dimensional. $S$ also had 3 input units for the possible events 'A', 'B', and 'C', as well as 3 output units, one for each fast weight of $F$. Neither of the networks needed hidden units for this task. The activation function of all output units was the identity function. The weight-modification function (1) for the fast weights was given by

\begin{displaymath}
\sigma(w_{ab}(t-1), \Box w_{ab}(t)) =
\left(1 + e^{-T(w_{ab}(t-1) + \Box w_{ab}(t) - 0.5)}\right)^{-1}. \qquad (7)
\end{displaymath}

Here $T$ determines the maximal steepness of the logistic function used to bound the fast weights between 0 and 1.
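As a minimal sketch, equation (7) can be written as a small function. The name `bounded_fast_weight` and its argument names are chosen here for illustration only; `delta_w` stands for $\Box w_{ab}(t)$.

```python
import math


def bounded_fast_weight(w_prev, delta_w, T=10.0):
    """Equation (7): logistic squashing of w_prev + delta_w, centred at 0.5.

    w_prev  -- fast weight at time t-1
    delta_w -- proposed change at time t (the Box term in the equation)
    T       -- steepness parameter; the slope at the midpoint is T / 4
    """
    return 1.0 / (1.0 + math.exp(-T * (w_prev + delta_w - 0.5)))
```

The result always lies strictly between 0 and 1, and it equals 0.5 exactly when `w_prev + delta_w` is 0.5. With $T=10$, a sum of 1.0 is mapped close to 1 and a sum of 0.0 close to 0.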

The weights of $S$ were randomly initialized between -0.1 and 0.1. The task was considered solved if $F$'s error did not exceed 0.05 for 100 consecutive time steps. With fast-weight changes based on (4), $T=10$, and $\eta = 1.0$, the system learned to solve the task within 300 time steps. With fast-weight changes based on the FROM/TO-architecture and (5), $T=10$, and $\eta = 0.5$, the system learned to solve the task within 800 time steps.
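The success criterion can be checked mechanically. The sketch below assumes that a per-step error value for $F$ is already available as a list; how that error is measured is not restated here, and the function name `first_solved_step` is hypothetical.

```python
def first_solved_step(errors, threshold=0.05, run_length=100):
    """Return the 0-based index of the step that completes the first run of
    `run_length` consecutive errors not exceeding `threshold`, or None."""
    streak = 0  # number of consecutive steps within the threshold so far
    for t, err in enumerate(errors):
        streak = streak + 1 if err <= threshold else 0
        if streak >= run_length:
            return t
    return None
```

A single step with an error above the threshold resets the count, so 99 good steps followed by one bad step do not qualify.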

The typical solution to this problem has the following properties: when an 'A' event occurs, $S$ responds by producing a large weight on the 'B' input line of $F$ (a weight that is otherwise small), thus turning $F$ into a 'B' detector. When a 'B' event occurs, $S$ 'resets' $F$ by making the weight on the 'B' line small again, so that $F$ remains unresponsive to further 'B' events until the next 'A' is received.
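This behaviour can be imitated with hand-set weights. The sketch below is an illustration only: the values in `S_WEIGHTS` are chosen by hand and are not the weights found in the experiment, and it assumes that the change proposed by $S$ at a given step takes effect after $F$ has produced its output for that step. It also assumes that the error is the absolute difference between output and target.

```python
import math
import random

EVENTS = ("A", "B", "C")
T = 10.0  # steepness value reported for the experiment


def sigma(w_prev, delta_w):
    """Bounded fast-weight update of equation (7)."""
    return 1.0 / (1.0 + math.exp(-T * (w_prev + delta_w - 0.5)))


# Hand-chosen (NOT learned) weights of S. S_WEIGHTS[event][line] is the change
# S proposes for F's fast weight on input `line` when `event` is presented.
S_WEIGHTS = {
    "A": {"A": 0.0, "B": +1.0, "C": 0.0},  # 'A' makes the 'B' line large
    "B": {"A": 0.0, "B": -1.0, "C": 0.0},  # 'B' makes the 'B' line small again
    "C": {"A": 0.0, "B": 0.0, "C": 0.0},   # 'C' proposes no change
}


def max_error(events):
    """Run the hand-built system and return F's largest absolute error."""
    fast = {line: 0.0 for line in EVENTS}  # F's three fast weights
    armed = False                          # bookkeeping for the target only
    worst = 0.0
    for event in events:
        # Local coding plus identity output: F's output equals the fast
        # weight on the single active input line.
        output = fast[event]
        target = 1.0 if (event == "B" and armed) else 0.0
        worst = max(worst, abs(output - target))
        if event == "A":
            armed = True
        elif event == "B":
            armed = False
        # Assumed ordering: S's proposed changes are applied after F responded.
        for line in EVENTS:
            fast[line] = sigma(fast[line], S_WEIGHTS[event][line])
    return worst


if __name__ == "__main__":
    rng = random.Random(0)
    stream = [rng.choice(EVENTS) for _ in range(2000)]
    print(max_error(stream))
```

With $T=10$ the squashing function has two stable states, one near 0 and one near 1, so the weight on the 'B' line keeps its value across any number of intervening 'C' events. Under the stated assumptions the error stays below the 0.05 threshold used in the experiment.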


Juergen Schmidhuber 2003-02-13
