Simple component functions (CFs).
This term makes (1) unit activations decrease to zero in proportion to their fan-outs, (2) first-order derivatives of activation functions decrease to zero in proportion to their fan-ins, and (3) the influence of units on the output decrease to zero in proportion to their fan-ins. For a detailed analysis see Hochreiter and Schmidhuber (1997a). It is the reason why low-complexity (or simple) CFs are preferred.
Sparseness. Point (1) above favors sparse hidden unit activations (here: few active components); point (2) favors non-informative hidden unit activations hardly affected by small input changes. Point (3) favors sparse hidden unit activations in the sense that "few hidden units contribute to producing the output". In particular, sigmoid hidden units with activation function 1/(1 + exp(-x)) favor near-zero activations.
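The link between points (1) and (2) for sigmoid units can be illustrated with a minimal sketch. It assumes the logistic sigmoid 1/(1 + exp(-x)), whose derivative can be written as a(1 - a) for activation a, so a near-zero activation also implies a near-zero derivative. The net inputs, the threshold of 0.05, and the helper names are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
import math


def logistic(x: float) -> float:
    """Logistic sigmoid 1 / (1 + exp(-x)); assumed activation function."""
    return 1.0 / (1.0 + math.exp(-x))


def logistic_derivative(x: float) -> float:
    """Derivative of the logistic sigmoid, expressed via the activation a."""
    a = logistic(x)
    return a * (1.0 - a)


def count_active(activations, threshold=0.05):
    """Count units whose activation exceeds a (hypothetical) threshold."""
    return sum(1 for a in activations if a > threshold)


# Hypothetical net inputs to five hidden units: four strongly negative, one positive.
net_inputs = [-6.0, -5.0, -7.0, 3.0, -6.5]
activations = [logistic(x) for x in net_inputs]
derivatives = [logistic_derivative(x) for x in net_inputs]

for x, a, d in zip(net_inputs, activations, derivatives):
    print(f"net input {x:+.1f}: activation {a:.4f}, derivative {d:.4f}")
print("active units:", count_active(activations))
```

In this toy setting only one unit counts as active, and the units with near-zero activations also have near-zero derivatives, so they are hardly affected by small input changes.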