ALTERNATIVE DEFINITIONS OF

This section lists four different approaches for defining the term which enforces non-trivial discriminative classifications. Section 2.1 presents a novel method that encourages locally represented classes (as in winner-take-all networks). The advantage of this method is that the class representations are orthogonal to each other and easy to understand; its disadvantage is the low representation capacity. In contrast, the remaining methods can generate distributed class representations. Section 2.2 defines this term with the help of auto-encoders. One advantage of this straightforward method is that it is easy to implement. A disadvantage is that predictable information conveyed by some input pattern does not necessarily help to minimize the reconstruction error of an auto-encoder (this holds for the stereo task, for instance). Section 2.3 mentions the Infomax approach for defining this term and explains why we do not pursue this approach. Finally, Section 2.4 defines it by means of the recent method for predictability minimization [Schmidhuber, 1992]. An advantage of this method is its potential for creating distributed class representations with statistically independent components.
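The trade-off between local and distributed class representations mentioned above can be illustrated with a small sketch. It is not taken from the paper: the function names `local_codes`, `distributed_codes`, and `dot` are hypothetical, and binary output units are assumed purely for illustration. It shows that one-hot (winner-take-all style) codes over n units are mutually orthogonal but can represent only n classes, whereas binary distributed codes over the same units can represent up to 2^n classes.

```python
import itertools


def local_codes(n_units):
    """One-hot codes (illustrative assumption): exactly one active unit per class."""
    return [tuple(1 if i == k else 0 for i in range(n_units))
            for k in range(n_units)]


def distributed_codes(n_units):
    """All binary activation patterns over n_units (illustrative assumption)."""
    return list(itertools.product((0, 1), repeat=n_units))


def dot(a, b):
    """Inner product of two code vectors."""
    return sum(x * y for x, y in zip(a, b))


if __name__ == "__main__":
    n = 4
    local = local_codes(n)
    dist = distributed_codes(n)

    # Local codes: one class per unit, and distinct codes are orthogonal.
    orthogonal = all(dot(a, b) == 0
                     for a, b in itertools.combinations(local, 2))
    print("local classes:", len(local), "pairwise orthogonal:", orthogonal)

    # Distributed codes: exponentially many classes, but not all orthogonal.
    print("distributed classes:", len(dist))
```

With four units, the local scheme yields four orthogonal class codes, while the distributed scheme yields sixteen patterns, most of which overlap in their active units.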

Juergen Schmidhuber 2003-02-13