This paper sequentially addresses three reasons for redundancy reduction:

1. *Redundancy reduction can help to decrease the search space for goal-directed learning procedures.* The next section will show this in the context of sequence classification, where redundancy reduction can sometimes achieve enormous learning speed-ups in comparison to more traditional approaches.

2. *Redundancy reduction allows for data compression.* Section 3 will review a ``neural'' method for text compression. With certain short newspaper articles, the neural method can achieve better compression ratios than the widely used, asymptotically optimal Lempel-Ziv string compression algorithm (as embodied by the UNIX utilities ``compress'' and ``gzip'').

3. *Redundancy reduction promises to simplify statistical classifiers.* For efficiency reasons, most statistical classifiers (e.g., Bayesian pattern classifiers) assume statistical independence of their input variables (corresponding to the pattern components). We would like a method that takes an arbitrary pattern ensemble as input and generates an equivalent factorial code. The code (instead of the original ensemble) could then be fed into an efficient conventional classifier, which in turn could achieve its optimal performance. Section 4 will review a ``neural'' method designed to generate (nearly) factorial binary codes.
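The third item hinges on the factorization that a factorial code provides: if the code components are statistically independent, a classifier may legitimately multiply per-component probabilities instead of modeling the full joint distribution. A minimal sketch of this idea, in the style of a naive Bayes classifier over binary code components (all probabilities below are invented for illustration, not taken from the paper):

```python
import numpy as np

# Toy class-conditional marginals P(bit_i = 1 | class) for 2 classes and
# 4 binary code components. These numbers are purely illustrative.
p_bit_given_class = np.array([
    [0.9, 0.1, 0.8, 0.2],   # class 0
    [0.2, 0.7, 0.3, 0.9],   # class 1
])
prior = np.array([0.5, 0.5])

def classify(code):
    """Naive Bayes over binary code components.

    The product below is a valid joint probability only because the
    components are assumed statistically independent -- which is exactly
    what a factorial code would guarantee.
    """
    code = np.asarray(code)
    # P(code | class) = prod_i p_i^b_i * (1 - p_i)^(1 - b_i)
    likelihood = np.prod(
        np.where(code == 1, p_bit_given_class, 1.0 - p_bit_given_class),
        axis=1,
    )
    posterior = likelihood * prior
    return int(np.argmax(posterior))

print(classify([1, 0, 1, 0]))  # matches the class-0 marginals -> 0
print(classify([0, 1, 0, 1]))  # matches the class-1 marginals -> 1
```

If the inputs were *not* independent (e.g., raw, redundant pattern components), this product would misstate the joint probability; mapping the patterns to a factorial code first is what makes the cheap factorized classifier correct.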