Assume that the alphabet contains $k$ possible characters $z_1, z_2, \ldots, z_k$. The (local) representation of $z_i$ is a binary $k$-dimensional vector $r(z_i)$ with exactly one non-zero component (at the $i$-th position). $P$ has $nk$ input units and $k$ output units. $n$ is called the ``time-window size''. We insert $n$ default characters $z_0$ at the beginning of each file. The representation of the default character, $r(z_0)$, is the $k$-dimensional zero-vector. The $m$-th character of file $f$ (starting from the first default character) is called $c^f_m$.
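For all files $f$ and all possible $m > n$, $P$ receives as an input
$$ r(c^f_{m-n}) \circ r(c^f_{m-n+1}) \circ \ldots \circ r(c^f_{m-1}), \qquad (1) $$
where $\circ$ is the concatenation operator for vectors. $P$ produces as an output $P^f_m$, a $k$-dimensional vector, and is trained to minimize
$$ \frac{1}{2} \sum_{f} \sum_{m > n} \| r(c^f_m) - P^f_m \|^2. \qquad (2) $$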
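As a minimal sketch of how the input (1) and one term of the error (2) could be computed, the following Python fragment encodes a toy alphabet locally, pads a file with $n$ default characters, and builds the concatenated window. The function names `one_hot`, `window_input`, and `squared_error`, the toy alphabet, and the use of `None` for the default character $z_0$ are assumptions made for illustration only.

```python
from typing import List, Optional, Sequence


def one_hot(char: Optional[str], alphabet: Sequence[str]) -> List[float]:
    """Local representation r(z_i): k-dimensional, one non-zero component.

    `None` stands for the default character z_0 (an assumption of this
    sketch) and is mapped to the k-dimensional zero vector.
    """
    vec = [0.0] * len(alphabet)
    if char is not None:
        vec[alphabet.index(char)] = 1.0
    return vec


def window_input(padded: Sequence[Optional[str]], m: int, n: int,
                 alphabet: Sequence[str]) -> List[float]:
    """Concatenate r(c_{m-n}), ..., r(c_{m-1}); m is 1-based as in the text."""
    out: List[float] = []
    for pos in range(m - n, m):          # 1-based positions m-n .. m-1
        out.extend(one_hot(padded[pos - 1], alphabet))
    return out                           # length n * k


def squared_error(target: Sequence[float], output: Sequence[float]) -> float:
    """One term of expression (2): 0.5 * ||target - output||^2."""
    return 0.5 * sum((t - o) ** 2 for t, o in zip(target, output))


alphabet = ["a", "b", "c"]               # k = 3 (toy alphabet)
n = 2                                    # time-window size
padded = [None] * n + list("abca")       # n default characters in front

# Input used to predict the first real character (m = n + 1): all zeros.
x_first = window_input(padded, m=n + 1, n=n, alphabet=alphabet)
```

With this indexing, the window for `m = n + 1` consists only of default characters, so the first real character of every file is predicted from an all-zero input.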
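Expression (2) is minimal if the output $P^f_m$ always equals
$$ E\bigl( r(c^f_m) \mid c^f_{m-n}, \ldots, c^f_{m-1} \bigr), \qquad (3) $$
the conditional expectation of $r(c^f_m)$ given the preceding $n$ characters. Due to the local character representation, this is equivalent to the $i$-th component $(P^f_m)_i$ of the output vector being equal to the conditional probability
$$ Pr\bigl( c^f_m = z_i \mid c^f_{m-n}, \ldots, c^f_{m-1} \bigr) \qquad (4) $$
for all $f$ and all appropriate $m > n$.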
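The step from (2) to (3) and (4) can be made explicit for the empirical case. The following is a sketch under the assumption that the network is flexible enough to choose its output independently for every context; the symbols $x$ (a fixed context of $n$ characters), $S_x$, $N_x$, and $y$ are introduced here for illustration only.

```latex
% S_x : set of pairs (f, m) whose preceding n characters equal the context x
% N_x : number of elements of S_x
% y   : the k-dimensional output of P for context x
\begin{align*}
E_x(y) &= \frac{1}{2} \sum_{(f,m) \in S_x} \| r(c^f_m) - y \|^2, \\
\frac{\partial E_x}{\partial y}
  &= \sum_{(f,m) \in S_x} \bigl( y - r(c^f_m) \bigr) = 0
  \;\Longrightarrow\;
  y = \frac{1}{N_x} \sum_{(f,m) \in S_x} r(c^f_m), \\
y_i &= \frac{\#\{ (f,m) \in S_x : c^f_m = z_i \}}{N_x}.
\end{align*}
```

Since each $r(c^f_m)$ has a single non-zero component, the $i$-th component of the minimizer is the relative frequency with which context $x$ is followed by $z_i$, the empirical counterpart of the conditional probability in (4).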
In general, the $(P^f_m)_i$ will not quite match the corresponding conditional probabilities. For normalization purposes, we define
$$ P^f_m(i) = \frac{(P^f_m)_i}{\sum_{j=1}^{k} (P^f_m)_j}. \qquad (5) $$
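The normalization in (5) can be sketched in a few lines of Python. The function name `normalize` and the example values are hypothetical, and the sketch assumes that all output components are non-negative with a positive sum (as would hold, for example, for output units with values in the open interval (0, 1)).

```python
from typing import List, Sequence


def normalize(output: Sequence[float]) -> List[float]:
    """Divide each output component by the sum of all components, as in (5).

    Assumes non-negative components with a positive sum; otherwise the
    result could not be read as a probability distribution.
    """
    if any(v < 0 for v in output):
        raise ValueError("output components must be non-negative")
    total = sum(output)
    if total <= 0:
        raise ValueError("sum of output components must be positive")
    return [v / total for v in output]


raw_output = [0.25, 0.125, 0.125]        # hypothetical raw output for k = 3
probs = normalize(raw_output)            # [0.5, 0.25, 0.25]
```

After this step the components sum to one, so they can be used directly as a probability distribution over the $k$ characters.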