© 1997 by British Computer Society
Data Compression Using a Sort-Based Context Similarity Measure
Department of Computer Science, Gunma University, Kiryu, Gunma 376, Japan Email: yokoo{at}cs.gunma-u.ac.jp
Every symbol in the data can be predicted by taking the immediately preceding symbols, or context, into account. This paper proposes a new adaptive data-compression method based on a context similarity measure. We measure the similarity of contexts using a context sorting mechanism. The aim of context sorting is to store a set of contexts in a specific order so that contexts more similar to the current one are more accessible. The proposed method predicts the next symbol by ranking the previous context-symbol pairs in order of context similarity. The codeword for the next symbol represents the rank of the symbol in this ordered sequence. The compression performance is evaluated both analytically and empirically. Although the proposed method uses no probability distribution to make a prediction, it achieves good compression. It also reveals a strong relation between symbol-ranking compression and the Ziv-Lempel textual substitution method.
Received June 28, 1996; revised April 30, 1997.
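
The symbol-ranking idea summarized in the abstract can be illustrated with a minimal sketch. This is not the paper's context sorting mechanism: it is a brute-force stand-in that assumes context similarity is the length of the common suffix shared by two contexts (capped at a hypothetical `max_order`), with ties broken in favor of the more recent context. The function names and the escape convention for unseen symbols are likewise assumptions made for illustration.

```python
def common_suffix_len(text, i, j, limit):
    """Length of the common suffix of text[:i] and text[:j], capped at limit."""
    n = 0
    while n < limit and n < i and n < j and text[i - 1 - n] == text[j - 1 - n]:
        n += 1
    return n


def ranked_candidates(text, i, max_order=8):
    """Distinct symbols seen before position i, most similar context first."""
    # Each earlier position j contributes the pair (context text[:j], symbol text[j]).
    # Assumption: similarity = common suffix length; ties go to the later position.
    positions = sorted(
        range(i),
        key=lambda j: (common_suffix_len(text, i, j, max_order), j),
        reverse=True,
    )
    seen, order = set(), []
    for j in positions:
        if text[j] not in seen:
            seen.add(text[j])
            order.append(text[j])
    return order


def encode(text, max_order=8):
    """Replace each symbol by its rank among the ranked candidates."""
    codes = []
    for i, sym in enumerate(text):
        cands = ranked_candidates(text, i, max_order)
        if sym in cands:
            codes.append((cands.index(sym), None))
        else:
            # Escape (assumed convention): rank past the list, plus the literal.
            codes.append((len(cands), sym))
    return codes


def decode(codes, max_order=8):
    """Rebuild the text by replaying the same ranking on the decoded prefix."""
    text = []
    for rank, literal in codes:
        cands = ranked_candidates(text, len(text), max_order)
        text.append(cands[rank] if literal is None else literal)
    return "".join(text)
```

Calling `decode(encode(s))` should return `s`, since the decoder rebuilds the same ranking from the already decoded prefix. On repetitive input most ranks tend to be 0, and a real coder would follow this stage with a variable-length code for the ranks. The sketch takes quadratic time per symbol, so it only illustrates the ranking step and makes no claim about the efficiency of the method described in the paper.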