We write

V = \sum_{p}\sum_{i=1}^{n}\left(y_i^p\right)^2 - \lambda\sum_{i=1}^{n}\left(\bar{y}_i\right)^2     (5)

and maximize V (that is, minimize -V) subject to the
constraint

\sum_{i=1}^{n} y_i^p = 1, \qquad y_i^p \geq 0 \quad \mbox{for all } p.     (6)
Here, as well as throughout the remainder of this paper,
subscripts of symbols denoting vectors denote vector components:
x_i denotes the i-th element of some vector x.
\lambda is a positive constant, and
\bar{y}_i = \frac{1}{P}\sum_{p} y_i^p
denotes the mean activation of the i-th output unit,
taken over all P input patterns.
It is possible to show that
the first term on the right hand side of
(5) is maximized subject to (6) if each input pattern is locally
represented (just like with winner-take-all networks) by exactly
one corner of the n-dimensional hypercube spanned
by the possible output vectors, provided there are sufficiently many
output units [Prelinger, 1992].
Maximizing the second, negative term
encourages each local class representation to
become active in response to only
\frac{1}{n} of all possible input patterns.
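The interplay of the two terms can be illustrated with a small numerical sketch. The quadratic form of the objective below is a reconstruction from the surrounding prose, and all names (`objective`, `lam`, the example codes) are assumptions, not the paper's notation:

```python
import numpy as np

# A local ("winner-take-all") code: each of P = 3 patterns (rows)
# activates exactly one of n = 3 output units -- a hypercube corner.
Y_local = np.array([[1.0, 0.0, 0.0],
                    [0.0, 1.0, 0.0],
                    [0.0, 0.0, 1.0]])
# A distributed code that also satisfies constraint (6): sum_i y_i = 1.
Y_soft = np.full((3, 3), 1.0 / 3.0)

def objective(Y, lam):
    """Assumed form of (5): sum_p sum_i (y_i^p)^2 - lam * sum_i (mean_p y_i^p)^2."""
    first = np.sum(Y ** 2)                       # maximized by hypercube corners
    penalty = lam * np.sum(Y.mean(axis=0) ** 2)  # punishes unequal unit usage
    return first - penalty

# Both codes use each unit for 1/n of the patterns, so the penalty term
# is identical; the corner code wins through the first term.
assert objective(Y_local, 1.0) > objective(Y_soft, 1.0)
```

Under (6) the sum of squares per pattern peaks at a corner (value 1) and the usage penalty bottoms out at \lambda n (1/n)^2 when every unit answers for 1/n of the patterns, which is exactly the behavior the two terms reward.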
Constraint (6) is enforced by setting
y_i = z_i / \sum_j z_j,
where z is the activation vector
(in response to the current input pattern) of an n-dimensional layer
of hidden units,
which can be considered as
the network's unnormalized output layer.
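Enforcing (6) by this normalization can be sketched as follows; the particular activation values are illustrative assumptions, and the sketch presumes non-negative hidden activations z_j (e.g. logistic units):

```python
import numpy as np

# Activations z of the n-dimensional hidden layer (values assumed).
z = np.array([0.2, 0.5, 0.3])

# y_i = z_i / sum_j z_j: the normalized outputs satisfy constraint (6).
y = z / z.sum()

assert np.isclose(y.sum(), 1.0)  # outputs sum to one
assert np.all(y >= 0.0)          # and remain non-negative
```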
This novel method is easy to implement;
it achieves an effect similar to that of the recent
entropy-based method of
Bridle and MacKay (1992).
Juergen Schmidhuber, 2003-02-13