

EXPERIMENT 3: More Realistic Visual Data

Task 3.1. As in Experiment 2, the goal is to extract features from visual data. The input data is more realistic, though: we use an aerial shot of a village.

Figure 7: Task 3.1 -- village image. Image sections used for training (left) and testing (right).
figure=vilf.eps,angle=-90,width=1.0

Details. Figure 7 shows two images with $150 \times 150$ pixels, each pixel taking on one of 256 gray levels. $7 \times 7$ pixel subsections from the left-hand-side (right-hand-side) image are randomly chosen as training inputs (test inputs), where gray levels are scaled to input activations in $[-0.5,0.5]$. Training stop: after 150,000 training examples. Parameters: learning rate: 1.0, $E_{tol} = 3.0$, $\Delta \lambda = 0.05$. Architecture: (49-25-49).
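The training setup above can be sketched in NumPy. This is a minimal toy sketch only: the image is a synthetic stand-in for the village photograph, training uses plain backprop (LOCOCODE's flat-minimum regularizer, which prunes hidden units, is omitted), and the learning rate is smaller than the paper's 1.0 to keep the toy run stable.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the 150x150 training image (the paper uses
# the village photograph); smooth structure so the toy run learns quickly.
xs, ys = np.meshgrid(np.arange(150), np.arange(150))
image = 127.5 + 127.5 * np.sin(xs / 10.0) * np.sin(ys / 10.0)

def sample_patch(img, size=7):
    """Randomly choose a size x size subsection, scaled to [-0.5, 0.5]."""
    r = rng.integers(0, img.shape[0] - size + 1)
    c = rng.integers(0, img.shape[1] - size + 1)
    return img[r:r + size, c:c + size].ravel() / 255.0 - 0.5

# (49-25-49) autoencoder trained by plain backprop.
W1 = rng.normal(0, 0.1, (25, 49)); b1 = np.zeros(25)
W2 = rng.normal(0, 0.1, (49, 25)); b2 = np.zeros(49)
lr = 0.1            # the paper uses 1.0 (with its regularizer)
errors = []

for _ in range(2000):                     # the paper uses 150,000 examples
    x = sample_patch(image)
    h = np.tanh(W1 @ x + b1)              # hidden code (25 units)
    y = W2 @ h + b2                       # reconstruction (49 outputs)
    e = y - x
    errors.append(float(e @ e))
    dh = (W2.T @ e) * (1.0 - h ** 2)      # backprop through tanh
    W2 -= lr * np.outer(e, h); b2 -= lr * e
    W1 -= lr * np.outer(dh, x); b1 -= lr * dh
```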

Image structure. The image is mostly dark except for certain white regions. In a preprocessing stage we map pixel values above 119 to 255 (white) and quantize pixel values below 120 into 9 different gray values. The largest reconstruction errors will be due to absent information about white pixels. Our receptive fields are too small to capture structures such as lines (streets).
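The preprocessing stage can be sketched as follows. Note the exact bin boundaries and representative gray values for the 9 dark bins are an assumption; the text only fixes the white threshold.

```python
import numpy as np

def preprocess(img):
    """Hypothetical sketch of the preprocessing stage: pixel values above
    119 become white (255); values below 120 are quantized into 9 gray
    bins of width 14 (the bin layout is an assumption, not from the paper)."""
    img = np.asarray(img)
    dark = (img // 14) * 14 + 7   # 9 representative gray values for 0..119
    return np.where(img > 119, 255, dark)
```

Applied to the full gray-level range 0..255 this yields at most 10 distinct values: 9 dark grays plus white.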

Results: sparse codes, on-center-off-surrounds. 6 trials led to similar results (the variance across trials is tiny, so 6 trials seem sufficient). Only 9 to 11 HUs survive. They indeed reflect the structure of the image (compare the preprocessing stage):

(1) Informative white spots are captured by on-center-off-surround HUs.

(2) Since the image is mostly dark (this also causes the off-surround effect), all output units are negatively biased.

(3) Since most bright spots are connected (most white pixels are surrounded by white pixels), output/input units near an active output/input unit tend to be active, too: positive weight strength decreases as one moves away from the center.

(4) The entire input is covered by the on-centers of surviving units, so all white regions in the input will be detected.

(5) The code is sparse: few surviving white-spot detectors are active simultaneously, because most inputs are mostly dark.

Figure 8 depicts typical weights on connections to and from HUs (output units are negatively biased); here 10 units survive.

Figure 8: Task 3.1 (village). Left: LOCOCODE's input-to-hidden weights. Right: hidden-to-output weights. Most units are essentially pruned away.
figure=vil7b.eps,angle=0,width=1.0

PCA and ICA. Figure 9 shows results for PCA and ICA. PCA-10 codes and ICA-10 codes are about as informative as 10-component lococodes (ICA-10 a bit more, PCA-10 a bit less). PCA-15 codes convey no more information: LOCOCODE and ICA suit the image structure better. Because there is no significant difference between successive PCA eigenvalues after the 8th, LOCOCODE did find an appropriate number of code components.
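The eigenvalue-gap argument can be illustrated on synthetic data: a hypothetical 49-dimensional patch matrix generated from 8 strong sources plus noise (a stand-in for the village patches), where a large ratio between successive eigenvalues marks the gap.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the 49-dimensional training patches:
# 8 strong independent sources mixed into 49 dimensions, plus noise.
sources = rng.normal(size=(10000, 8))
mixing = rng.normal(size=(8, 49))
patches = sources @ mixing + 0.1 * rng.normal(size=(10000, 49))

# PCA: eigenvalues of the patch covariance matrix, in descending order.
X = patches - patches.mean(axis=0)
eigvals = np.linalg.eigvalsh(X.T @ X / len(X))[::-1]

# No significant difference between successive eigenvalues after the gap
# suggests a natural number of code components.
ratios = eigvals[:-1] / eigvals[1:]
gap_after = int(np.argmax(ratios)) + 1   # here: after the 8th eigenvalue
```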

Figure 9: Task 3.1 (village). PCA and ICA (with 10 and 15 components): weights to code components.
figure=vil7a.eps,angle=0,width=1.0

Figure 10 depicts the test image reconstructed from the codes, with code components mapped to 100 discrete intervals. Reconstruction is limited to the $147 \times 147$ pixels of the image covered by $21 \times 21$ input fields of size $7 \times 7$ (the 3 remaining stripes of pixels on the right and lower borders are black). Code efficiency and reconstruction error averaged over the test image are given in Table 2. The bits required for coding the $147 \times 147$ section of the test image are: LOCOCODE: 14,108; ICA-10: 16,255; PCA-10: 16,312; ICA-15: 23,897.

Figure 10: Task 3.1 (village). $147 \times 147$ pixels of test images reconstructed by LOCOCODE, ICA-10, PCA-10 and ICA-15. Code components are mapped to 100 discrete intervals. The second best method (ICA-10) requires 15% more bits than LOCOCODE.
figure=vil7t.eps,angle=-90,width=1.0

Task 3.2. Like Task 3.1, but the inputs stem from a $150 \times 150$ pixel section of an image of wood cells (Figure 11; left: training image, right: test image). $E_{tol} = 1.0$, $\Delta \lambda = 0.01$. Training stop: after 250,000 training examples. All other parameters are as in Task 3.1.

Figure 11: Task 3.2 -- wood cells. Image sections used for training (left) and testing (right).
figure=cellf.eps,angle=-90,width=1.0

Image structure. The image consists of elliptic cells of various sizes. Cell interiors are bright; cell borders are dark.

Figure 12: Task 3.2 (cells). Left: LOCOCODE's input-to-hidden weights. 11 units survive.
figure=cellb2.eps,angle=0,width=1.0

Results. 4 trials led to similar results (the variance across trials is tiny, so 4 trials seem sufficient). Bias weights to HUs are negative. To activate some HU, its input must match the structure of its incoming weights well enough to cancel the inhibitory bias. 9 to 11 units survive. They are obvious feature detectors and can be characterized by the positions of the centers of their on-center-off-surround structures relative to the input field. They specialize in detecting the following cases: the on-center is north, south, west, east, northeast, northwest, southeast, or southwest of a cell, or centered on a cell, or between cells. Hence the entire input is covered by position-specialized on-centers.

Figure 12 depicts typical weights on connections to and from HUs. Typical feature detectors: unit 20 detects a southeastern cell; unit 21 western and eastern cells; unit 23 cells in the northwest and southeast corners.

PCA and ICA. Figure 13 shows results for PCA and ICA. PCA-11 codes and ICA-11 codes are about as informative as the 11-component lococode (ICA-11 a little less, PCA-11 a little more). It seems that both LOCOCODE and ICA detect relevant sources: the positions of the cell interiors (and cell borders) relative to the input field. Gaps in the PCA eigenvalues occur between the 10th and the 11th and between the 15th and the 16th. LOCOCODE essentially found the first gap.

Figure 13: Task 3.2 (cells). PCA and ICA (with 11 and 15 components): weights to code components.
figure=cella.eps,angle=0,width=1.0

Task 3.3. Like Task 3.1, but now we use images of a striped piece of wood (see Figure 14). $E_{tol} = 0.1$. Training stop: after 300,000 training examples. All other parameters are as in Task 3.1.

Image structure. The image consists of dark vertical stripes on a brighter background.

Figure 14: Task 3.3 -- striped wood. Image sections used for training (left) and testing (right).
figure=piecef.eps,angle=-90,width=1.0

Results. 4 trials led to similar results. Only 3 to 5 of the 25 HUs survive and become obvious feature detectors, now of a different kind: they detect whether their receptive field covers a dark stripe to the left, to the right, or in the middle.

Figure 15: Task 3.3 (stripes). Left: LOCOCODE's input-to-hidden weights. 4 units survive.
figure=pieceb2.eps,angle=0,width=1.0

Figure 15 depicts typical weights on connections to and from HUs. Example feature detectors: unit 6 detects a dark stripe to the left, unit 11 a dark stripe in the middle, unit 15 dark stripes left and right, unit 25 a dark stripe to the right.

PCA and ICA. See Figure 16. PCA-4 codes and ICA-4 codes are about as informative as 4-component lococodes. The component structures of PCA/ICA codes and lococodes are very similar: all detect the positions of dark stripes relative to the input field. Gaps in the PCA eigenvalues occur between the 3rd and 4th, the 4th and 5th, and the 5th and 6th eigenvalues. LOCOCODE automatically extracts about 4 relevant components.

Figure 16: Task 3.3 (stripes). PCA and ICA (with 11 and 15 components).
figure=piecea.eps,angle=0,width=1.0



Juergen Schmidhuber 2003-02-13

