**Jürgen Schmidhuber**
**IDSIA, Switzerland**

The components of most real-world patterns contain redundant
information. However, most pattern classifiers (e.g., statistical
classifiers and neural nets) work better if pattern
components are nonredundant. I present various
unsupervised nonlinear predictor-based "neural"
learning algorithms that transform patterns and pattern
sequences into less redundant patterns without loss of
information. The first part of the paper shows how a neural predictor
can be used to remove redundant information
from input sequences.
Experiments with artificial sequences
demonstrate that certain supervised
classification techniques can greatly benefit from this kind of
unsupervised preprocessing.
In the second part of the paper, a neural predictor
is used to remove redundant information from natural text.
With certain short newspaper articles, the neural method can achieve
better compression ratios than the widely used asymptotically
optimal Lempel-Ziv string compression algorithm.
The third part of the paper shows how a system of co-evolving
neural predictors and neural code generating modules
can build factorial (statistically nonredundant)
codes of pattern ensembles. The method is successfully applied
to images of letters presented at random according to their
probabilities in English.
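
The core idea of the first part, that a predictor lets one discard whatever it already predicts correctly without losing information, can be illustrated with a small sketch. This is not the paper's algorithm: a simple count table stands in for the neural predictor, and the names `train_predictor`, `encode`, and `decode` are hypothetical. It assumes that the encoder and decoder share the same trained predictor.

```python
from collections import Counter, defaultdict


def train_predictor(seq, order=1):
    """Count-based stand-in for a neural predictor (an assumption of this sketch).

    Maps each context of `order` preceding symbols to its most frequent successor.
    """
    counts = defaultdict(Counter)
    for i in range(order, len(seq)):
        counts[seq[i - order:i]][seq[i]] += 1
    return {ctx: c.most_common(1)[0][0] for ctx, c in counts.items()}


def encode(seq, predictor, order=1):
    """Keep the first `order` symbols and every (position, symbol) the predictor got wrong."""
    surprises = [
        (i, seq[i])
        for i in range(order, len(seq))
        if predictor.get(seq[i - order:i]) != seq[i]
    ]
    return seq[:order], surprises, len(seq)


def decode(prefix, surprises, length, predictor, order=1):
    """Rebuild the sequence by running the same predictor and applying the stored corrections."""
    corrections = dict(surprises)
    out = list(prefix)
    for i in range(len(prefix), length):
        context = "".join(out[i - order:i])
        # A stored correction overrides the prediction at position i.
        out.append(corrections.get(i, predictor.get(context)))
    return "".join(out)


seq = "abababababcabababab"
predictor = train_predictor(seq)
prefix, surprises, length = encode(seq, predictor)
print(surprises)  # [(10, 'c')]: the only symbol the predictor failed to predict
print(decode(prefix, surprises, length, predictor) == seq)  # True
```

The reduced description (the prefix, the list of surprises, and the length) is shorter than the input but still determines it exactly, which is the sense in which the redundancy is removed "without loss of information".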

- INTRODUCTION
  - WHAT IS REDUNDANT INFORMATION?
  - WHAT IS REDUNDANCY REDUCTION?
  - REDUNDANCY REDUCTION: WHY?
  - REDUNDANCY REDUCTION: HOW?
- EXAMPLE 1: Sequence Classification
- EXAMPLE 2: Text Compression
  - PREDICTING CONDITIONAL PROBABILITIES
  - USING THE PREDICTOR FOR COMPRESSION
  - SIMULATIONS
  - ON-LINE METHODS / LIMITATIONS
- EXAMPLE 3: Discovering Factorial Codes
- OUTLOOK
- ACKNOWLEDGMENTS
- Bibliography