Next: OUTLOOK Up: EXAMPLE 3: Discovering Factorial Previous: PREDICTABILITY MINIMIZATION

SIMULATIONS: IMAGE CODING

In collaboration with Stefanie Lindstädt, the method was applied to an ensemble of 80 characters from the Courier-font DEC dataset. Each character was represented by an image of $10 \times 15$ binary pixels. Since there are only 80 characters but $2^{150}$ possible patterns representable by 150 binary pixels, the training set contains an enormous amount of redundant information.

During training, the images were presented at random according to their frequencies of occurrence in English text. The unsupervised system had 150 input units, 16 code units, and 1 ``bias'' unit. Each predictor had 15 input units, 1 ``bias'' unit, and 1 output unit. The learning rate of the predictors was 10 times as high as that of the code units. Within 10000 pattern presentations, the system often learned a loss-free code of the ensemble that was much less redundant than the original data: the redundancy (defined in section 1.2) of the original DEC dataset is $13.41$, while the redundancy of a 16-bit code discovered by the system is $2.5$. See [14], [13], and [24] for details.
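The adversarial training scheme described above can be sketched as follows. This is a minimal illustration, not the original implementation: it assumes sigmoid code units and single-layer sigmoid predictors trained by plain gradient steps on the squared prediction error, with each predictor descending on that error while the corresponding code unit ascends on it. The 150/16 unit counts and the 10:1 learning-rate ratio are taken from the text; all other constants are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

n_in, n_code = 150, 16                           # pixel inputs, code units (as in the text)
W = rng.normal(0.0, 0.1, (n_code, n_in + 1))     # code-unit weights, incl. bias
V = rng.normal(0.0, 0.1, (n_code, n_code))       # each predictor: 15 other codes + bias
lr_pred, lr_code = 0.1, 0.01                     # predictors learn 10x faster (as stated)

def pm_step(x):
    """One predictability-minimization update on a single binary image x (length 150)."""
    xb = np.append(x, 1.0)                       # input plus bias unit
    y = sigmoid(W @ xb)                          # the 16 code-unit activations
    sq_errs = []
    for i in range(n_code):
        others = np.append(np.delete(y, i), 1.0) # the other 15 code units + bias
        p = sigmoid(V[i] @ others)               # predictor i's guess for code unit i
        err = p - y[i]
        # predictor i: gradient DESCENT on (p - y_i)^2 w.r.t. its weights
        V[i] -= lr_pred * err * p * (1.0 - p) * others
        # code unit i: gradient ASCENT on the same error w.r.t. its weights
        # (dE/dy_i = -2*err, so ascent carries a minus sign on err here);
        # the code unit tries to escape the predictor's prediction
        W[i] -= lr_code * err * y[i] * (1.0 - y[i]) * xb
        sq_errs.append(err ** 2)
    return np.mean(sq_errs)

# toy run on random binary "images" standing in for the character set
for _ in range(200):
    pm_step(rng.integers(0, 2, n_in).astype(float))
```

In the full method the two objectives are coupled at every step: as the predictors improve, the code units are pushed toward mutually independent (factorial) representations, since only statistically independent code components cannot be predicted from one another.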

This result corresponds to a dramatic reduction of redundant information, although the achieved value is not optimal. In many realistic cases, however, such approximations of nonredundant codes should be satisfactory. We intend to apply the method to the problem of unsupervised segmentation of real-world images. See [30] for an application to simple stereo vision.

One might speculate about whether the brain uses a similar principle based on ``code neurons'' trying to escape the predictions of ``predictor neurons''.


Juergen Schmidhuber 2003-02-19