
Learning Temporary Variable Binding

Some researchers have claimed that neural nets are incapable of performing variable binding. Others, however, have argued for the potential usefulness of `dynamic links' (e.g. [v.d. Malsburg, 1981]), which may serve exactly this purpose. With the fast-weight method, it is possible to train a system to use fast weights as dynamic links in order to temporarily bind variable contents to variable names (or `fillers' to `slots') for as long as is necessary to solve a particular task.

In the simple experiment described next, the system learns to remember where in a parking lot a car has been left. This involves temporarily binding a value (the current parking slot) to a variable that represents the car's location.

Neither $F$ nor $S$ needed hidden units for this task. The activation function of all output units was the identity function. All inputs to the system were binary, as were $F$'s desired outputs. $F$ had one input unit, which stood for the name of the variable WHERE-IS-MY-CAR?. In addition, $F$ had three output units for the names of three possible parking slots $P_1$, $P_2$, and $P_3$ (the possible answers to WHERE-IS-MY-CAR?). $S$ had three output units, one for each fast weight, and six input units. (Note that $S$ need not always receive the same input as $F$.) Three of the six input units, the parking-slot detectors $I_1$, $I_2$, $I_3$, were activated for one time step when the car was parked in the corresponding slot (while the other slot detectors remained switched off). The three additional input units were randomly activated with binary values at each time step. These random activations served as distracting, time-varying inputs from the environment of a car owner whose life looks like this: he drives his car around for zero or more time steps (at each time step the probability that he stops driving is 0.25). Then he parks his car in one of three possible slots. Then he conducts business outside the car for zero or more time steps, during which all parking-slot detectors are switched off again (at each time step the probability that he finishes business is 0.25). Then he remembers where he has parked his car, goes to the corresponding slot, enters his car, and starts driving again, and so on.
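
For concreteness, the following sketch (in Python; all names are purely illustrative and not part of the original setup) simulates the car owner's behaviour and produces the binary input streams described above: the three parking-slot detectors and the three random distractor inputs.

    import random

    P_STOP = 0.25   # probability of ending the driving or business phase at each step
    NUM_SLOTS = 3   # parking slots P_1, P_2, P_3

    def environment():
        """Yield one (slot_detectors, distractors, parked_slot) triple per time step.

        slot_detectors : the detectors I_1, I_2, I_3; a detector is on only for the
                         single step at which the car is parked in its slot.
        distractors    : three random binary inputs, active at every step.
        parked_slot    : index of the occupied slot, or None while the owner drives.
        """
        while True:
            # driving phase: zero or more steps (stop probability 0.25 per step)
            while random.random() >= P_STOP:
                yield [0, 0, 0], [random.randint(0, 1) for _ in range(3)], None
            # the car is parked; the corresponding detector fires for exactly one step
            slot = random.randrange(NUM_SLOTS)
            detectors = [int(i == slot) for i in range(NUM_SLOTS)]
            yield detectors, [random.randint(0, 1) for _ in range(3)], slot
            # business phase: detectors are off again, but the car remains parked
            while random.random() >= P_STOP:
                yield [0, 0, 0], [random.randint(0, 1) for _ in range(3)], slot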

Our system focussed on the problem of remembering the position of the car. It was trained by activating the WHERE-IS-MY-CAR? unit at randomly chosen time steps and, as long as the car was parked in one of the three slots, by providing $F$'s desired output, namely the activation of the output unit corresponding to the current slot $P_i$.
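
The sketch below shows only how the training signals might be scheduled; it does not reproduce the fast-weight update itself, which is given by equations (4) and (7). The query probability QUERY_PROB is a hypothetical parameter (the text only states that the WHERE-IS-MY-CAR? unit was activated at randomly chosen time steps), and we assume a target is provided only on steps where the query unit is active and the car is parked.

    import random

    QUERY_PROB = 0.5   # hypothetical; the text only says "randomly chosen time steps"

    def training_signal(parked_slot):
        """Return (query, target) for one time step.

        query  : activation of F's single WHERE-IS-MY-CAR? input unit.
        target : desired activations of F's three slot output units P_1..P_3,
                 or None when no training signal is given. A target is provided
                 only while the car is parked in one of the three slots.
        """
        query = 1.0 if random.random() < QUERY_PROB else 0.0
        if query and parked_slot is not None:
            target = [1.0 if i == parked_slot else 0.0 for i in range(3)]
        else:
            target = None
        return query, target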

The weights of $S$ were randomly initialized between -0.1 and 0.1. The task was considered solved if $F$'s error did not exceed 0.05 for 100 consecutive time steps. The on-line version (without episode boundaries) was employed. With the weight-modification function (7), fast-weight changes based on (4), $T=10$, and $\eta =0.02$, the system learned to solve the task within 6000 time steps. As expected, $S$ learned to `bind' parking-slot units to the WHERE-IS-MY-CAR? unit by means of strong temporary fast-weight connections. Due to the local output representation, the binding patterns were easy to interpret: at a given time there was a large fast weight on the connection leading from the WHERE-IS-MY-CAR? unit to the appropriate parking-slot unit (provided the car was currently parked), while the other fast weights remained temporarily suppressed.
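
Two of these details, the success criterion and the readout of the binding pattern, can be made explicit in the same illustrative style (again, the function names are ours, not the paper's):

    def task_solved(error_history, window=100, threshold=0.05):
        """Solved once F's error stays at or below 0.05 for 100 time steps in a row."""
        recent = error_history[-window:]
        return len(recent) == window and all(e <= threshold for e in recent)

    def bound_slot(fast_weights):
        """With the local output code, the slot currently bound to WHERE-IS-MY-CAR?
        is the output unit reached by the largest of the three fast weights."""
        return max(range(len(fast_weights)), key=lambda i: fast_weights[i])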

