According to our analysis, LOCOCODE
attempts to describe individual inputs with as few
and as simple features as possible.
Given the statistical properties of many visual inputs
(with few defining features),
this typically results in sparse codes.
Unlike the objective functions of previous methods,
however, LOCOCODE's does *not* contain an explicit term
enforcing, say, sparse codes --
sparseness and independence are not
viewed as good things *a priori.*
Instead we focus
on the information-theoretic
complexity of the mappings used for coding and decoding.
The resulting codes typically compromise between conflicting
goals. They tend to be sparse and exhibit
*low but not minimal* redundancy --
when the cost of minimal redundancy is too high.

Our results suggest that LOCOCODE's objective may embody a general principle of unsupervised learning going beyond previous, more specialized ones. We see that there is at least one representative (FMS) of a broad class of algorithms (regularizers that reduce network complexity) which (1) can do optimal feature extraction as a by-product, (2) outperforms traditional ICA and PCA on visual source separation tasks, and (3) unlike ICA does not even need to know the number of independent sources in advance. This reveals an interesting, previously ignored connection between regularization and ICA, and may represent a first step towards unification of regularization and unsupervised learning.
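The idea that a complexity penalty on the coding and decoding mappings (rather than on the code itself) can shape the learned representation can be illustrated with a toy sketch. The snippet below is *not* FMS: as a deliberately crude, hypothetical stand-in for a complexity-reducing regularizer it uses plain weight decay on a tiny linear autoencoder, trained by gradient descent on data generated from two underlying source features. All names and parameters are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 50 samples of 8-dim inputs generated from 2 source features,
# mimicking visual inputs with "few defining features".
S = rng.random((50, 2))
A = rng.standard_normal((2, 8))
X = S @ A

# Tiny linear autoencoder, 8 -> 4 -> 8.  Weight decay serves here as a
# crude proxy for a regularizer that penalizes mapping complexity; the
# code layer H itself carries no explicit sparseness or independence term.
W1 = 0.1 * rng.standard_normal((8, 4))
W2 = 0.1 * rng.standard_normal((4, 8))
lr, decay = 0.01, 1e-3

def loss(X, W1, W2):
    """Mean squared reconstruction error."""
    return np.mean((X @ W1 @ W2 - X) ** 2)

init = loss(X, W1, W2)
for _ in range(500):
    H = X @ W1                              # code layer
    E = H @ W2 - X                          # reconstruction error
    gW2 = H.T @ E / len(X) + decay * W2     # gradient + complexity penalty
    gW1 = X.T @ (E @ W2.T) / len(X) + decay * W1
    W1 -= lr * gW1
    W2 -= lr * gW2

final = loss(X, W1, W2)
print(final < init)
```

Even this simplified penalty only constrains the mappings, yet the representation adapts to the data's few underlying sources; FMS replaces the weight-decay term with a flat-minimum criterion, which is what yields the sparse, low-redundancy codes reported above.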

**More.**
Due to space limitations,
much additional theoretical and experimental analysis
had to be left to a tech report (29 pages, 20 figures)
on the WWW: please see
[15].

**Acknowledgments.**
This work was supported by *DFG grant
SCHM 942/3-1* from "Deutsche Forschungsgemeinschaft."