
INTRODUCTION

In the field of unsupervised learning, several information-theoretic objective functions (OFs) have been proposed to evaluate the quality of sensory codes. Most OFs focus on properties of the code components; we refer to them as code component-oriented OFs, or COCOFs. Some COCOFs explicitly favor near-factorial, minimally redundant codes of the input data [2,18,23,7,24], while others favor local codes [22,3,16]. Recently there has also been much work on COCOFs encouraging biologically plausible sparse distributed codes [20,10,25,9,6,8,12,17].

While COCOFs express desirable properties of the code itself, they neglect the costs of constructing the code from the data. For example, coding input data without redundancy may be very expensive in terms of the information required to describe the code-generating network, which may need many finely tuned free parameters. We believe that one of sensory coding's objectives should be to reduce the cost of code generation through data transformations, and postulate that an important scarce resource is the bits required to describe the mappings that generate and process the codes.

Hence we shift the point of view and focus on the information-theoretic costs of code generation. We use a novel approach to unsupervised learning called "low-complexity coding and decoding" (LOCOCODE [15]). Without assuming particular goals such as data compression or subsequent classification, but in the spirit of research on minimum description length (MDL), LOCOCODE generates so-called lococodes that (1) convey information about the input data, (2) can be computed from the data by a low-complexity mapping (LCM), and (3) can be decoded by an LCM. We will see that by minimizing coding/decoding costs, LOCOCODE can yield efficient, robust, noise-tolerant mappings for processing inputs and codes.

Lococodes through regularizers. To implement LOCOCODE we apply regularization to an autoassociator (AA) whose hidden layer activations represent the code. The hidden layer is forced to code information about the input data by minimizing training error; the regularizer reduces coding/decoding costs. Our regularizer of choice will be Flat Minimum Search (FMS) [14].
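
The general setup described above (an autoassociator trained on reconstruction error plus a regularization term) can be illustrated with a minimal sketch. Note that this is not FMS: plain L2 weight decay is used here only as a simple stand-in regularizer, and the network shape (sigmoid hidden layer, linear output), the function names train_autoassociator and encode, and all hyperparameters are assumptions made for illustration.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def train_autoassociator(X, n_hidden=3, lam=1e-3, lr=0.1, epochs=500, seed=0):
    """Train a one-hidden-layer autoassociator by batch gradient descent.

    Objective = mean squared reconstruction error + lam * sum of squared weights.
    The weight-decay penalty is a stand-in regularizer (an assumption), NOT FMS.
    """
    rng = np.random.default_rng(seed)
    n_samples, n_in = X.shape
    W1 = rng.normal(scale=0.1, size=(n_in, n_hidden))   # input -> code
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(scale=0.1, size=(n_hidden, n_in))   # code -> reconstruction
    b2 = np.zeros(n_in)
    history = []

    for _ in range(epochs):
        # Forward pass: hidden activations H are the code.
        H = sigmoid(X @ W1 + b1)
        Y = H @ W2 + b2
        err = Y - X

        # Training error plus regularization penalty.
        mse = np.mean(np.sum(err ** 2, axis=1))
        penalty = lam * (np.sum(W1 ** 2) + np.sum(W2 ** 2))
        history.append(mse + penalty)

        # Backward pass (gradients of the objective above).
        dY = 2.0 * err / n_samples
        gW2 = H.T @ dY + 2.0 * lam * W2
        gb2 = dY.sum(axis=0)
        dZ = (dY @ W2.T) * H * (1.0 - H)
        gW1 = X.T @ dZ + 2.0 * lam * W1
        gb1 = dZ.sum(axis=0)

        W1 -= lr * gW1
        b1 -= lr * gb1
        W2 -= lr * gW2
        b2 -= lr * gb2

    return (W1, b1, W2, b2), history


def encode(X, params):
    """Map inputs to their hidden-layer code."""
    W1, b1, _, _ = params
    return sigmoid(X @ W1 + b1)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.uniform(size=(20, 5))          # toy data, purely illustrative
    params, history = train_autoassociator(X)
    print("objective: first", history[0], "last", history[-1])
    print("code shape:", encode(X, params).shape)
```

In this sketch the training-error term forces the hidden layer to carry information about the inputs, while the penalty term plays the role of the regularizer; swapping in a different regularizer would change only the penalty and its gradient terms.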


Juergen Schmidhuber 2003-02-25