This section lists four *different* approaches for
defining the term that enforces
non-trivial discriminative classifications.
Section 2.1 presents a novel method that encourages
locally represented classes
(as in winner-take-all networks). The advantage of this
method is that the class representations are orthogonal to each other
and easy to understand;
its disadvantage is the low representation capacity. In contrast,
the remaining methods can generate distributed class representations.
Section 2.2 defines this term with the help of auto-encoders.
One advantage of this straightforward method is that it
is easy to implement. A disadvantage is that predictable
information conveyed by some input pattern does not necessarily
help to minimize the reconstruction error of an auto-encoder
(this holds for the stereo task, for instance).
Section 2.3 mentions the Infomax approach for defining this term and
explains why we do not pursue this approach.
Section 2.4 finally defines it by the recent method for
*predictability minimization* [Schmidhuber, 1992].
An advantage of this method is its potential for creating
distributed class
representations with statistically independent components.

Juergen Schmidhuber
2003-02-13