Next: NON-UNIFORMLY DISTRIBUTED INPUTS Up: EXPERIMENTS Previous: UNIFORMLY DISTRIBUTED INPUTS

OCCAM'S RAZOR AT WORK

The experiments in this section are meant to verify the effectiveness of Occam's razor, mentioned in the introduction. It is interesting to note that, with non-factorial codes, predictability minimization prefers to reduce the number of units in use rather than to minimize the sum of bit-entropies à la Barlow et al. (1989). This can be seen from an example described by Mitchison in the appendix of Barlow et al.'s paper, in which the minimal sum of bit-entropies is achieved by an expansive local coding of the input. Local representations, however, maximize mutual predictability: with a local representation, each unit can always be predicted from all the others. Predictability minimization tries to avoid this by creating non-local, non-expansive codings.
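
The claim about mutual predictability can be illustrated with a small sketch. It is not taken from the paper and does not reproduce Mitchison's example: it assumes four equiprobable input patterns and compares a local (one-per-pattern) code with a 2-bit binary code. The function name `conditional_entropy` and both code tables are hypothetical, chosen for illustration only. The sketch measures, in bits, how much uncertainty remains about one unit once all other units are known.

```python
from collections import Counter
from math import log2

def conditional_entropy(codes, i):
    """H(y_i | all other units) in bits, assuming equiprobable code vectors."""
    n = len(codes)
    joint = Counter(codes)                                 # counts of full code vectors
    rest = Counter(c[:i] + c[i + 1:] for c in codes)       # counts with unit i removed
    h = 0.0
    for c, cnt in joint.items():
        r = c[:i] + c[i + 1:]
        # p(c) * log2 p(y_i | rest), accumulated with a minus sign
        h -= (cnt / n) * log2(cnt / rest[r])
    return h

# Hypothetical codes for 4 equiprobable patterns (illustrative assumption).
local_code = [(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)]
binary_code = [(0, 0), (0, 1), (1, 0), (1, 1)]

local_h = [conditional_entropy(local_code, i) for i in range(4)]
binary_h = [conditional_entropy(binary_code, i) for i in range(2)]

print("local code, H(y_i | others):", local_h)
print("binary code, H(y_i | others):", binary_h)
```

Under these assumptions every unit of the local code has zero conditional entropy, so it is fully predictable from the others, while each unit of the 2-bit code keeps one full bit of uncertainty.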

Experiment 1: off-line, $dim(y) = 3$, $dim(x) = 4$, local input representation, 3 hidden units per predictor, 4 hidden units shared among the representational modules. 10 test runs with 10,000 epochs for the representational modules were conducted. In 7 cases the system found a binary factorial code: In the end, one of the output units always emitted a constant value. In the remaining 3 cases, the code was at least binary and invertible.

Experiment 2: off-line, $dim(y) = 4$, $dim(x) = 4$, local input representation, 3 hidden units per predictor, 4 hidden units shared among the representational modules. 10 test runs with 10,000 epochs for the representational modules were conducted. In 5 cases the system found a binary factorial code: in the end, two of the output units always emitted a constant value. In the remaining 5 cases, the code did not use the minimal number of output units but was at least binary and invertible.

Experiment 3: on-line, $dim(y) = 4$, $dim(x) = 2$, distributed input representation, 2 hidden units per predictor, 4 hidden units shared among the representational modules. 10 test runs with 250,000 pattern presentations were conducted. This was sufficient to always find a quasi-binary factorial code: in the end, two of the output units always emitted a constant value. In 7 out of 10 cases, fewer than 100,000 pattern presentations (corresponding to 25,000 epochs of 4 patterns each) were necessary.
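
The outcome criteria used in the three experiments (binary, invertible, factorial, and the number of constant output units) can be checked mechanically. The following is a minimal sketch, not the procedure used in the paper. It assumes equiprobable patterns and assumes that quasi-binary outputs may be thresholded at 0.5. All function names and the example code tables are hypothetical.

```python
from collections import Counter
from itertools import product
from math import isclose, prod

def binarize(outputs, threshold=0.5):
    """Map real-valued unit activations to 0/1 (threshold is an assumption)."""
    return [tuple(int(v > threshold) for v in y) for y in outputs]

def constant_units(codes):
    """Indices of units that emit the same value for every pattern."""
    return [i for i in range(len(codes[0])) if len({c[i] for c in codes}) == 1]

def is_invertible(codes):
    """True if distinct patterns receive distinct code vectors."""
    return len(set(codes)) == len(codes)

def is_factorial(codes):
    """True if the joint code distribution equals the product of unit marginals."""
    n, n_units = len(codes), len(codes[0])
    joint = Counter(codes)
    p_one = [sum(c[i] for c in codes) / n for i in range(n_units)]
    for v in product((0, 1), repeat=n_units):
        p_prod = prod(p_one[i] if v[i] else 1.0 - p_one[i] for i in range(n_units))
        if not isclose(joint[v] / n, p_prod, abs_tol=1e-12):
            return False
    return True

# Hypothetical quasi-binary outputs for 4 patterns and 4 output units:
# units 2 and 3 are (nearly) constant, units 0 and 1 carry the information.
outputs = [(0.02, 0.01, 0.97, 0.03), (0.03, 0.98, 0.96, 0.02),
           (0.99, 0.02, 0.97, 0.04), (0.97, 0.99, 0.98, 0.01)]
found_code = binarize(outputs)

# A local code over the same 4 patterns, for comparison.
local_code = [(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)]

print(constant_units(found_code), is_invertible(found_code), is_factorial(found_code))
print(constant_units(local_code), is_invertible(local_code), is_factorial(local_code))
```

In this sketch the thresholded code has two constant units and is both invertible and factorial, whereas the local code is invertible but not factorial.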


Juergen Schmidhuber 2003-02-13

