
STEREO TASK

The binary stereo experiment described in [Becker and Hinton, 1989] (see also example 2 in section 1) served to compare IMAX to our approach.

Becker and Hinton report that their system (based on binary probabilistic units) was able to extract the `shift' between two simple stereoscopic binary images only if IMAX was applied in successive `layer by layer' bootstrap stages. In addition, they heuristically tuned the learning rate during training. Finally, they imposed a maximal weight change on each weight during gradient ascent.
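To make the task concrete, the following minimal sketch generates one binary stereo pattern. It is an illustration only, not the data generator used in the experiments: the function name `make_stereo_pair`, the strip width of 8, the one-pixel shift and the circular wrap-around are all assumptions.

```python
import random

def make_stereo_pair(width=8, rng=random):
    """Return (left, right, shift) for one hypothetical binary stereo pattern.

    Assumptions (not taken from the paper): a 1-D strip of `width` random
    bits, a shift of exactly one pixel to the left (-1) or right (+1),
    and circular wrap-around at the borders.
    """
    left = [rng.randint(0, 1) for _ in range(width)]
    shift = rng.choice([-1, 1])
    # The right view is the left view moved by `shift` pixels (circularly).
    right = [left[(i - shift) % width] for i in range(width)]
    return left, right, shift

if __name__ == "__main__":
    left, right, shift = make_stereo_pair(rng=random.Random(0))
    print("left :", left)
    print("right:", right)
    print("shift:", shift)
```

In such a setup, each of the two networks would see a different part of the pair, and the only information their inputs have in common is the shift.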

In contrast, the method described herein (based on continuous-valued units) does not rely on successive bootstrap stages or any other heuristic considerations.

We minimized (4) with $D_l$ defined by predictability minimization according to (9).

In a first experiment, we employed a different set of weights for each network. In ten test runs involving 100,000 training patterns, the networks always learned to extract the stereoscopic shift. This performance of our non-bootstrapped system is comparable to that of Becker and Hinton's bootstrapped system.

In a second experiment, we used only one set of weights for both networks (this reduces the number of free parameters). The result was a significant decrease in learning time: in ten test runs, the system needed between 20,000 and 50,000 training patterns to learn to extract the shift.
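The reduction in free parameters can be illustrated by simple counting. The sketch below assumes two identical fully connected feedforward networks with bias weights; the layer sizes are hypothetical and are not the ones used in the experiments.

```python
def count_weights(layer_sizes):
    """Number of weights (including one bias per unit) in a fully
    connected feedforward net with the given layer sizes."""
    return sum((n_in + 1) * n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

# Hypothetical architecture: 8 inputs, 4 hidden units, 1 output unit.
layer_sizes = [8, 4, 1]

separate = 2 * count_weights(layer_sizes)  # one weight set per network
shared = count_weights(layer_sizes)        # one weight set for both networks

if __name__ == "__main__":
    print("separate weight sets:", separate)  # 82
    print("shared weight set:   ", shared)    # 41
```

With identical architectures, sharing one weight set halves the number of free parameters, whatever the layer sizes are.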


Juergen Schmidhuber 2003-02-13