The input ensemble considered in this subsection consists
of four different patterns denoted by , , , and ,
respectively. The patterns occurred with probabilities 1/9, 2/9, 2/9, and 4/9, in that order. A factorial code is given by
code : , , , .
With code , the total objective function becomes . A non-factorial but invertible (information-preserving) code is given by
code : , , , .
With code , , which is only below . This already indicates that certain local maxima of the internal state's objective function may be very close to the global maxima.
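The distinction between factorial and merely invertible codes can be checked mechanically. The following is a minimal sketch, assuming pattern probabilities 1/9, 2/9, 2/9, and 4/9 (the relative frequencies implied by the presentation counts 1, 2, 2, 4 per nine-pattern epoch used in the experiments). The function names and the two candidate code assignments are hypothetical illustrations and are not the codes discussed in the text.

```python
from fractions import Fraction
from itertools import product

# Assumed pattern probabilities (from presentation counts 1, 2, 2, 4 out of 9).
PROBS = [Fraction(1, 9), Fraction(2, 9), Fraction(2, 9), Fraction(4, 9)]


def is_invertible(code):
    """A code is invertible if distinct patterns receive distinct code words."""
    return len(set(code)) == len(code)


def is_factorial(code, probs):
    """A binary code is factorial if the probability of every possible code
    word equals the product of the marginal probabilities of its components."""
    n_units = len(code[0])
    # Marginal probability that unit i is switched on.
    p_on = [
        sum(p for word, p in zip(code, probs) if word[i] == 1)
        for i in range(n_units)
    ]
    for word in product((0, 1), repeat=n_units):
        joint = sum(p for w, p in zip(code, probs) if w == word)
        independent = Fraction(1)
        for i, bit in enumerate(word):
            independent *= p_on[i] if bit == 1 else 1 - p_on[i]
        if joint != independent:
            return False
    return True


# Hypothetical code assignments, one code word per pattern (illustration only).
candidate_a = [(1, 1), (1, 0), (0, 1), (0, 0)]
candidate_b = [(1, 1), (0, 0), (1, 0), (0, 1)]

for name, code in [("candidate_a", candidate_a), ("candidate_b", candidate_b)]:
    print(name, "invertible:", is_invertible(code),
          "factorial:", is_factorial(code, PROBS))
```

Under these assumptions, both candidates are invertible, but only `candidate_a` makes the two code units statistically independent; in `candidate_b` the joint probability of the word (1, 1) is 1/9, whereas the product of its marginals is 5/27.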
Experiment 1: off-line, , , distributed input representation with , , , , 1 hidden unit per predictor, 2 hidden units shared among the representational modules. 10 test runs with 2,000 epochs for the representational modules were conducted. Here one epoch consisted of the presentation of 9 patterns: the first pattern was presented once, the second and third twice each, and the fourth four times.
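The epoch composition described above can be sketched as follows. This is only an illustration of how the presentation counts translate into pattern probabilities; the integer labels 0 to 3 are placeholders for the four input patterns, and the helper names are hypothetical.

```python
from collections import Counter
from fractions import Fraction

# Presentation counts per epoch as stated in the text: 1, 2, 2, 4.
# The labels 0..3 are placeholders for the four input patterns.
COUNTS = {0: 1, 1: 2, 2: 2, 3: 4}


def build_epoch(counts):
    """Return the list of pattern labels presented during one epoch."""
    epoch = []
    for label, n in counts.items():
        epoch.extend([label] * n)
    return epoch


def empirical_probs(epoch):
    """Relative frequency of each pattern within one epoch."""
    freq = Counter(epoch)
    return {label: Fraction(n, len(epoch)) for label, n in freq.items()}


epoch = build_epoch(COUNTS)
probs = empirical_probs(epoch)
print(len(epoch), probs)
```

With off-line training over such an epoch, each pattern contributes to the accumulated error in proportion to its relative frequency.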
In 7 of the 10 runs, the system found a global maximum corresponding to a factorial code. In the remaining 3 runs the code was not invertible.
Experiment 2 (Occam's Razor): Like experiment 1, but with . In all but one of the 10 test runs the system developed a factorial code (including one unused unit). In the remaining test run the code was at least invertible.
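An unused unit does not spoil factoriality: a unit that is constant across all patterns is trivially independent of the others. The sketch below illustrates this with total correlation (the sum of the marginal entropies minus the joint entropy), which is zero exactly when the code components are independent. Note that this measure is swapped in for illustration and is not the objective function used in the experiments; the pattern probabilities 1/9, 2/9, 2/9, 4/9 and all code assignments are assumptions.

```python
from collections import defaultdict
from math import log2

# Assumed pattern probabilities (presentation counts 1, 2, 2, 4 out of 9).
PROBS = [1 / 9, 2 / 9, 2 / 9, 4 / 9]


def entropy(dist):
    """Shannon entropy in bits; zero-probability entries are skipped."""
    return -sum(p * log2(p) for p in dist if p > 0)


def total_correlation(code, probs):
    """Sum of per-unit entropies minus the entropy of the whole code word.
    The result is zero (up to rounding) exactly for factorial codes."""
    joint = defaultdict(float)
    for word, p in zip(code, probs):
        joint[word] += p
    marginal_sum = 0.0
    for i in range(len(code[0])):
        p_on = sum(p for word, p in zip(code, probs) if word[i] == 1)
        marginal_sum += entropy([p_on, 1.0 - p_on])
    return marginal_sum - entropy(joint.values())


# Hypothetical code assignments (illustration only).
two_unit_code = [(1, 1), (1, 0), (0, 1), (0, 0)]
# Same code with a third unit that is always off (the "unused" unit).
three_unit_code = [word + (0,) for word in two_unit_code]
# An invertible code whose units are statistically dependent.
dependent_code = [(1, 1), (0, 0), (1, 0), (0, 1)]

for name, code in [("two units", two_unit_code),
                   ("three units, one unused", three_unit_code),
                   ("dependent", dependent_code)]:
    print(name, total_correlation(code, PROBS))
```

The first two values vanish up to floating-point rounding, while the dependent code yields a strictly positive value.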
With local input representation and , , the success rate dropped below 50 percent. With , the system usually found invertible but rarely factorial codes. This reflects the fact that with certain input ensembles there is a trade-off between redundancy and invertibility: Superfluous degrees of freedom among the representational units may increase the probability that an information-preserving code is found, while at the same time decreasing the probability of finding an optimal factorial code.