D. METHOD 3

This section presents another way of ``predicting away'' redundant information in sequences. Again, we pre-process input sequences with a network that tries to predict the next input, given previous inputs. The input vector corresponding to time step $t$ of sequence $p$ is denoted by $x^p(t)$. Among the possible input vectors, the one with minimal Euclidean distance to the network's real-valued output vector at time $t$ is denoted by $z^p(t)$. $z^p(t)$ is interpreted as the deterministic vector-valued prediction of $x^p(t+1)$.
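The step that turns the real-valued network output into a deterministic prediction can be sketched as a nearest-neighbour lookup. This is a minimal illustration only: the function name `nearest_input_vector`, the three-symbol one-hot coding, and the tie-breaking rule (first candidate wins) are assumptions made for the example and are not specified in the text.

```python
from typing import List, Sequence

Vector = Sequence[float]


def nearest_input_vector(output: Vector, candidates: List[Vector]) -> Vector:
    """Return the candidate with minimal Euclidean distance to `output`.

    Squared distances are compared, which gives the same minimiser.
    Ties are broken in favour of the earliest candidate (an assumption).
    """
    def squared_distance(a: Vector, b: Vector) -> float:
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    return min(candidates, key=lambda c: squared_distance(c, output))


# Hypothetical coding: one-hot vectors for a three-symbol alphabet.
candidates = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]

# A real-valued network output is snapped to the closest legal input vector.
z = nearest_input_vector((0.2, 0.7, 0.1), candidates)
```

Because the result is always one of the legal input vectors, it can be compared for exact equality with the input that actually arrives next.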

It is important to observe that all information about the input vector $x^p(t_k)$ (at time $t_k$) is conveyed by the following data: the time $t_k$, a description of the predictor and its initial state, and the set

\begin{displaymath} \{ (t_s, x^p(t_s)) ~~\mathrm{with}~~ 0 < t_s \leq t_k, ~ z^p(t_s - 1) \neq x^p(t_s) \}. \end{displaymath}

In what follows, this observation will be used to compress text files.
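The observation can be illustrated with a small round trip: store only the time steps at which the deterministic prediction was wrong, then rebuild the sequence by rerunning the same predictor and patching in the stored corrections. The sketch below is an assumption-laden toy, not the method's actual implementation: symbols stand in for input vectors, the predictor is modelled as a pure function of the history (so its "initial state" is simply the empty history), and `repeat_last` is a hypothetical stand-in for the trained network.

```python
from typing import Callable, List, Sequence, Tuple

Symbol = str
Predictor = Callable[[Sequence[Symbol]], Symbol]


def encode(sequence: Sequence[Symbol], predict: Predictor) -> List[Tuple[int, Symbol]]:
    """Keep only the pairs (t_s, x(t_s)) where the prediction was wrong.

    Time steps are 1-based; the prediction for step t uses inputs 1..t-1.
    """
    exceptions = []
    for t in range(1, len(sequence) + 1):
        history = sequence[:t - 1]
        actual = sequence[t - 1]
        if predict(history) != actual:
            exceptions.append((t, actual))
    return exceptions


def decode(length: int, exceptions: List[Tuple[int, Symbol]],
           predict: Predictor) -> List[Symbol]:
    """Rebuild the sequence from its length, the predictor, and the exceptions."""
    corrections = dict(exceptions)
    history: List[Symbol] = []
    for t in range(1, length + 1):
        # Use the stored input where the predictor failed, else trust it.
        history.append(corrections[t] if t in corrections else predict(history))
    return history


def repeat_last(history: Sequence[Symbol]) -> Symbol:
    """Toy predictor (assumption): the previous symbol repeats; 'a' at the start."""
    return history[-1] if history else "a"


text = list("aaabbbbcc")
exceptions = encode(text, repeat_last)   # [(4, 'b'), (8, 'c')]
restored = decode(len(text), exceptions, repeat_last)
```

Here nine symbols are represented by two exception pairs plus the sequence length. The better the predictor, the fewer pairs need to be stored.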

Juergen Schmidhuber 2003-02-13