This section shows how to ``predict away'' redundant information in sequences constructed from a finite set of possible input symbols. We pre-process input sequences with a network that tries to predict the next input, given the previous inputs. The input vector corresponding to time step $t$ of sequence $p$ is denoted by $x^p(t)$. The network's real-valued output vector is denoted by $y^p(t)$. Among the possible input vectors there is one with minimal Euclidean distance to $y^p(t)$. This one is denoted by $z^p(t)$. $z^p(t)$ is interpreted as the deterministic vector-valued prediction of $x^p(t+1)$.
It is important to observe that
all information about the input vector $x^p(t_k)$ (at time $t_k$) is
conveyed by the following data:
the time $t_k$,
a description of the predictor
and its initial state,
and the set
$$\{ (t_s, x^p(t_s)) \mid 0 < t_s \leq t_k,\ z^p(t_s - 1) \neq x^p(t_s) \}.$$
In other words, we can forget about the predictable input vectors. We need to look at the unpredictable inputs only. Only the unexpected deserves attention. We apply this insight to sequences generated by the modified automaton above.
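The following Python sketch illustrates the observation above on a toy example. It is not the network used in this paper: `toy_predictor`, the three-symbol one-hot alphabet `SYMBOLS`, and the function names are hypothetical stand-ins chosen for illustration. The sketch rounds a real-valued output to the nearest possible input vector, stores only the time-stamped inputs that were mispredicted, and then recovers the full sequence from that set together with the (deterministic) predictor.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Vector = Tuple[float, ...]

# Hypothetical one-hot codes for a three-symbol input alphabet.
SYMBOLS: Dict[str, Vector] = {
    "a": (1.0, 0.0, 0.0),
    "b": (0.0, 1.0, 0.0),
    "c": (0.0, 0.0, 1.0),
}


def nearest_symbol(y: Sequence[float]) -> str:
    """Return the symbol whose code has minimal Euclidean distance to y."""
    def squared_distance(code: Vector) -> float:
        return sum((u - v) ** 2 for u, v in zip(code, y))
    return min(SYMBOLS, key=lambda s: squared_distance(SYMBOLS[s]))


def toy_predictor(history: Sequence[str]) -> Vector:
    """Stand-in for a trained network (an assumption, not the paper's model).

    Returns a real-valued guess for the next input: 'a' first, then the
    cycle a -> b -> c -> a based on the most recent input only.
    """
    if not history:
        return (0.9, 0.1, 0.0)
    successor = {"a": "b", "b": "c", "c": "a"}[history[-1]]
    # Deliberately not an exact one-hot vector, so rounding is needed.
    return tuple(0.8 * v + 0.1 for v in SYMBOLS[successor])


Predictor = Callable[[Sequence[str]], Vector]


def unexpected_inputs(seq: Sequence[str], predictor: Predictor) -> List[Tuple[int, str]]:
    """Collect (time, input) pairs for every input the predictor got wrong."""
    stored = []
    for t in range(1, len(seq) + 1):
        prediction = nearest_symbol(predictor(seq[: t - 1]))
        if prediction != seq[t - 1]:
            stored.append((t, seq[t - 1]))
    return stored


def reconstruct(length: int, stored: List[Tuple[int, str]], predictor: Predictor) -> List[str]:
    """Rebuild the sequence from its length, the predictor, and the stored set."""
    lookup = dict(stored)
    seq: List[str] = []
    for t in range(1, length + 1):
        if t in lookup:
            seq.append(lookup[t])  # unpredictable input: read it from the set
        else:
            seq.append(nearest_symbol(predictor(seq)))  # predictable: recompute
    return seq
```

For instance, `unexpected_inputs(list("abcabbcab"), toy_predictor)` keeps only the single input that breaks the cycle, and `reconstruct` with the sequence length and that set returns the original sequence. Reconstruction works only because the predictor is deterministic and is fed exactly the same history in both passes.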