• Regular strings: Many string-valued fields compress extremely well.
A URL, for example, nearly always starts with http://, and the rest of
the URL could be mostly redundant. A user-agent string is another
example, which tends to be a long string describing a browser but with a
high degree of regularity.
It is also usually more efficient to compress values of the same type than it
is to compress records containing multiple types. Floating point values have
numerical similarities, as do UTF-8 encoded characters in a string; these
similarities can be exploited by the compression algorithm to come up with
a more compact representation.
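As a rough illustration of the points above, the following sketch builds a few thousand synthetic records and compresses them once as interleaved rows and once as separate columns. Everything here is an assumption made for the example: the sample URLs and user-agent strings, the record shape, the helper name `ratio`, and the choice of zlib as the codec. The ratios it prints depend heavily on the data and the compressor, so treat them as indicative only.

```python
import random
import struct
import zlib

random.seed(0)  # fixed seed so the synthetic data is reproducible

# Hypothetical sample values; real logs would be far more varied.
PATHS = ["index", "search", "cart", "login", "help"]
AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15",
    "Mozilla/5.0 (X11; Linux x86_64) Gecko/20100101 Firefox/115.0",
]

# Each record holds a URL, a user-agent string, and a floating point value.
records = [
    (
        f"http://www.example.com/{random.choice(PATHS)}?id={random.randint(1, 500)}",
        random.choice(AGENTS),
        round(random.uniform(10.0, 20.0), 2),
    )
    for _ in range(2000)
]


def ratio(raw: bytes) -> float:
    """Return the uncompressed size divided by the zlib-compressed size."""
    return len(raw) / len(zlib.compress(raw))


# Row layout: the fields of each record are stored next to each other.
row_bytes = "\n".join(f"{u}\t{a}\t{v}" for u, a, v in records).encode("utf-8")

# Column layout: all values of one field are stored together.
url_col = "\n".join(u for u, _, _ in records).encode("utf-8")
agent_col = "\n".join(a for _, a, _ in records).encode("utf-8")
value_col = struct.pack(f"<{len(records)}d", *(v for _, _, v in records))

print(f"interleaved rows:  {ratio(row_bytes):.1f}x")
print(f"URL column:        {ratio(url_col):.1f}x")
print(f"user-agent column: {ratio(agent_col):.1f}x")
print(f"float column:      {ratio(value_col):.1f}x")
```

The string columns compress well because every value shares long substrings with its neighbors. The float column is packed as raw 8-byte doubles, and how far it shrinks depends on how similar the values actually are.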
Of course, not all fields compress well. In practice, though, the
compression ratio within a column is usually much higher than for raw text.
Compressed data takes less time to read from the network, and that saving
typically outweighs the extra time spent decompressing it, so you can scan
the data even faster.