The method presented in this paper is an instance of a strategy known as ``predictive coding'' or ``model-based coding''. To compress text files, a neural predictor network approximates the conditional probability distribution of possible ``next characters'', given previous characters. The predictor's outputs are fed into coding algorithms that generate short codes for characters with low information content (characters with high predicted probability) and long codes for characters conveying much information (highly unpredictable characters). Two standard coding algorithms are employed: Huffman Coding and Arithmetic Coding.
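The interplay between predictor and coder can be sketched as follows (a minimal Python illustration, not taken from the paper): the hypothetical dictionary \texttt{probs} stands in for the neural predictor's output distribution at a single time step, and a Huffman code built from it assigns roughly $-\log_2 p$ bits to a character predicted with probability $p$, so well-predicted characters receive short codes.

\begin{verbatim}
import heapq
from math import log2

def huffman_code_lengths(probs):
    """Build a Huffman code for a symbol->probability map and
    return the code length (in bits) assigned to each symbol."""
    # Each heap entry: (probability, unique tiebreak, set of symbols).
    heap = [(p, i, {s}) for i, (s, p) in enumerate(sorted(probs.items()))]
    heapq.heapify(heap)
    lengths = {s: 0 for s in probs}
    tiebreak = len(heap)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)
        p2, _, s2 = heapq.heappop(heap)
        # Merging two subtrees deepens every symbol inside them by one bit.
        for s in s1 | s2:
            lengths[s] += 1
        heapq.heappush(heap, (p1 + p2, tiebreak, s1 | s2))
        tiebreak += 1
    return lengths

# A skewed "next character" distribution, standing in for the
# neural predictor's output at one time step (illustrative values):
probs = {'e': 0.5, 't': 0.25, 'a': 0.15, 'q': 0.1}
for sym, bits in sorted(huffman_code_lengths(probs).items(),
                        key=lambda kv: kv[1]):
    print(f"{sym!r}: {bits} bits (ideal: {-log2(probs[sym]):.2f} bits)")
\end{verbatim}

Arithmetic Coding refines the same idea by approaching the ideal code length $-\log_2 p$ even when it is not an integer.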
With the off-line variant of the approach, the predictor is trained on a set of training files. After training, the weights are frozen. Copies of the network are installed at all machines functioning as message senders or receivers. From then on, the network is used to encode and decode unknown files without being changed any more. The weights become part of the code of the compression algorithm. Note that the storage occupied by the network weights does not have to be taken into account when measuring performance on unknown files, just like the code of a conventional data compression algorithm does not have to be taken into account.
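The symmetry between sender and receiver can be illustrated with a toy sketch (again Python, and purely an assumption for illustration: a frozen bigram counter stands in for the trained network, and rank coding stands in for Huffman/Arithmetic Coding). Because both sides run identical frozen copies, the decoder reproduces the encoder's predictions step by step and inverts the coding without any side information being transmitted.

\begin{verbatim}
TRAINING_TEXT = "the theme of the thesis"  # the training files, in miniature

def build_frozen_predictor(text):
    """'Training': collect bigram counts, then freeze them."""
    counts = {}
    for prev, cur in zip(text, text[1:]):
        counts.setdefault(prev, {}).setdefault(cur, 0)
        counts[prev][cur] += 1
    return counts  # the 'weights'; never modified again

def ranking(predictor, context, alphabet):
    """Characters ordered from most to least predicted given the context."""
    ctx_counts = predictor.get(context[-1] if context else "", {})
    return sorted(alphabet, key=lambda c: (-ctx_counts.get(c, 0), c))

def encode(text, predictor, alphabet):
    context, ranks = "", []
    for ch in text:
        ranks.append(ranking(predictor, context, alphabet).index(ch))
        context += ch
    return ranks  # low ranks = well-predicted chars = short codes downstream

def decode(ranks, predictor, alphabet):
    context = ""
    for r in ranks:
        context += ranking(predictor, context, alphabet)[r]
    return context

alphabet = sorted(set(TRAINING_TEXT))
frozen = build_frozen_predictor(TRAINING_TEXT)  # copies go to all machines
msg = "these themes"
assert decode(encode(msg, frozen, alphabet), frozen, alphabet) == msg
\end{verbatim}

In the actual scheme the ranks would in turn be Huffman- or arithmetic-coded; the point of the sketch is only that the frozen, shared predictor lets decoding mirror encoding exactly.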
The more sophisticated on-line variant of our approach will be addressed later.