Databases Reference
In-Depth Information
Dictionary Techniques
5.1 Overview
In the previous two chapters we looked at coding techniques that assume a source
that generates a sequence of independent symbols. As most sources are corre-
lated to start with, the coding step is generally preceded by a decorrelation step.
In this chapter we will look at techniques that incorporate the structure in the
data in order to increase the amount of compression. These techniques—both
static and adaptive (or dynamic)—build a list of commonly occurring patterns and encode
these patterns by transmitting their index in the list. They are most useful with sources that
generate a relatively small number of patterns quite frequently, such as text sources and com-
puter commands. We discuss applications to text compression, modem communications, and
image compression.
5.2 Introduction
In many applications, the output of the source consists of recurring patterns. A classic example
is a text source in which certain patterns or words recur constantly. Also, there are certain
patterns that simply do not occur, or if they do occur, it is with great rarity. For example,
sgiomlaireached might be an annoying habit but the word probably occurs in a very small
fraction of the text sources in existence.
A very reasonable approach to encoding such sources is to keep a list, or dictionary ,
of frequently occurring patterns. When these patterns appear in the source output, they are
 
 
Search WWH ::




Custom Search