
### SPECIAL CASE: LINEAR OUTPUT ACTIVATION.

Since our targets will
usually be in the linear range of a sigmoid output activation function,
let us consider the linear case in more detail.
Suppose all output units use the same linear
activation function f(x) = Cx
(where C is a real-valued constant).
Then, for every weight w on a connection into hidden unit j,
∂o_k/∂w = C w_{kj} ∂y_j/∂w,
where o_k denotes the k-th output and y_j the activation of hidden unit j.
Substituting this into the second term of the FMS regularizer, the constant C
and the derivatives ∂y_j/∂w cancel, and we obtain a contribution proportional to

log Σ_k ( Σ_j |w_{kj}| / ||w_j|| )²,

where w_j denotes the outgoing weight vector of unit j with
(w_j)_k := w_{kj},
||·|| the Euclidean vector norm,
and v_k the k-th component of a vector v.
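To make the term concrete, here is a small Python sketch (our own illustration, not the authors' code) that evaluates log Σ_k (Σ_j |w_{kj}| / ||w_j||)² for a hidden-to-output weight matrix; hidden units with all-zero outgoing vectors are skipped, matching the intuition that pruned units contribute nothing.

```python
import math

def second_term(W):
    """Reconstructed FMS second term (up to a constant factor) for linear
    outputs: log sum_k ( sum_j |W[k][j]| / ||w_j|| )^2, where column j of W
    is the outgoing weight vector w_j of hidden unit j.  All-zero columns
    (pruned hidden units) are skipped, i.e. they contribute nothing."""
    n_out, n_hid = len(W), len(W[0])
    norms = [math.sqrt(sum(W[k][j] ** 2 for k in range(n_out)))
             for j in range(n_hid)]
    total = 0.0
    for k in range(n_out):
        bracket = sum(abs(W[k][j]) / norms[j]
                      for j in range(n_hid) if norms[j] > 0.0)
        total += bracket ** 2
    return math.log(total)

# A single hidden unit driving both outputs equally gives
# sum_k (1/sqrt(2))^2 = 1, so the term is log(1) = 0.
print(second_term([[1.0], [1.0]]))  # ≈ 0, up to floating-point rounding
```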
**Few component functions preferred.**
We observe that hidden units whose outgoing weight vectors consist only of
near-zero weights yield small contributions to this term; that is,
the number of component functions (CFs) will get minimized.

**Common component functions preferred.**
Outgoing weight vectors of hidden units are encouraged to
have a large effect on the output
(see the denominator ||w_j|| in the term above).
This implies a preference for CFs that can be
used for generating many or all output components.
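This preference can be checked numerically with the second term derived above and toy weight matrices of our own choosing (a hypothetical two-output setting, not from the paper): one shared CF driving both outputs scores lower than one CF per output, which in turn scores lower than two redundant CFs each driving both outputs.

```python
import math

def second_term(W):
    # Reconstructed term: log sum_k (sum_j |W[k][j]| / ||w_j||)^2,
    # where column j of W is hidden unit j's outgoing weight vector;
    # all-zero columns (pruned units) are skipped.
    n_out, n_hid = len(W), len(W[0])
    norms = [math.sqrt(sum(W[k][j] ** 2 for k in range(n_out)))
             for j in range(n_hid)]
    return math.log(sum(
        sum(abs(W[k][j]) / norms[j]
            for j in range(n_hid) if norms[j] > 0.0) ** 2
        for k in range(n_out)))

shared    = [[1.0, 0.0], [1.0, 0.0]]  # one CF drives both outputs, one pruned
separated = [[1.0, 0.0], [0.0, 1.0]]  # one CF per output unit
redundant = [[1.0, 1.0], [1.0, 1.0]]  # both CFs drive both outputs

assert second_term(shared) < second_term(separated) < second_term(redundant)
```

The ordering (log 1 < log 2 < log 4 here) illustrates why a single common CF serving all outputs is preferred over several partially overlapping ones.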

**CF separation -- few relevant CFs per
output unit.**
On the other hand, two hidden units whose outgoing weight vectors
do not solely consist of near-zero weights are encouraged
to influence the output in
different ways, by not representing the same input feature
(see the numerator in the term above).
In fact, FMS punishes not only outgoing weight vectors with the same or
opposite directions but also vectors obtained by
flipping the signs of individual weights
(multiple reflections from
hyperplanes through the origin and orthogonal to one axis).
Hence two units performing redundant tasks, such as both activating
some output unit, or one activating it and the other de-activating it,
will cause large contributions to this term.
This encourages separation of CFs and use of
few CFs per output unit.
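The sign-flip invariance and the redundancy penalty can both be verified numerically (again with the reconstructed second term and toy matrices of our own choosing): flipping the signs of one unit's outgoing weights, so that it de-activates the outputs the other unit activates, leaves the penalty unchanged, and either redundant pair is punished more than two units with separated CFs.

```python
import math

def second_term(W):
    # Reconstructed term: log sum_k (sum_j |W[k][j]| / ||w_j||)^2,
    # where column j of W is hidden unit j's outgoing weight vector.
    n_out, n_hid = len(W), len(W[0])
    norms = [math.sqrt(sum(W[k][j] ** 2 for k in range(n_out)))
             for j in range(n_hid)]
    return math.log(sum(
        sum(abs(W[k][j]) / norms[j]
            for j in range(n_hid) if norms[j] > 0.0) ** 2
        for k in range(n_out)))

# Two units representing the same feature: unit 2 either activates the
# outputs along with unit 1, or de-activates them (all signs flipped).
same      = [[1.0,  1.0], [1.0,  1.0]]
flipped   = [[1.0, -1.0], [1.0, -1.0]]
separated = [[1.0,  0.0], [0.0,  1.0]]

# Sign flips do not change the penalty (absolute values in the numerator) ...
assert abs(second_term(same) - second_term(flipped)) < 1e-12
# ... and both redundant pairs are punished more than separated CFs.
assert second_term(same) > second_term(separated)
```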

Juergen Schmidhuber
2003-02-13
