

This section presents another way of ``predicting away'' redundant information in sequences. Again, we pre-process input sequences by a network that tries to predict the next input, given previous inputs. The input vector corresponding to time step $t$ of sequence $p$ is denoted by $x^p(t)$. The network's real-valued output vector is denoted by $y^p(t)$. Among the possible input vectors, the one with minimal Euclidean distance to $y^p(t)$ is denoted by $z^p(t)$ and interpreted as the deterministic vector-valued prediction of $x^p(t+1)$.
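The mapping from the real-valued output $y^p(t)$ to the deterministic prediction $z^p(t)$ can be sketched as a nearest-neighbor search over the set of possible input vectors. The function name and the use of NumPy are illustrative assumptions, not part of the original method description:

```python
import numpy as np

def nearest_input_vector(y, possible_inputs):
    """Map the network's real-valued output y to the possible input
    vector with minimal Euclidean distance to it (the vector z in the
    text), which serves as the deterministic prediction of the next
    input.  `possible_inputs` is assumed to be a finite alphabet of
    input vectors, e.g. one-hot codes for characters."""
    possible_inputs = np.asarray(possible_inputs, dtype=float)
    distances = np.linalg.norm(possible_inputs - np.asarray(y, dtype=float), axis=1)
    return possible_inputs[np.argmin(distances)]
```

For a one-hot alphabet this reduces to picking the unit vector whose active component has the largest output activation.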

It is important to observe that all information about the input vector $x^p(t_k)$ (at time $t_k$) is conveyed by the following data: the time $t_k$, a description of the predictor and its initial state, and the set

\[
\{ \, (t_s, x^p(t_s)) \mid 0 < t_s \leq t_k, \; z^p(t_s - 1) \neq x^p(t_s) \, \}.
\]

In what follows, this observation will be used to compress text files.


Juergen Schmidhuber 2003-02-13