Database Reference
In-Depth Information
Figure 4-18 illustrates the data in the row. Keep in mind that multi-byte values are stored in byte-swapped order,
similar to the FixedVar format. Moreover, 4-bit sections in the CD array are also swapped within each byte.
Figure 4-18. Row Compression: Row data
It is worth repeating that default type values, for example zeroes for int and bit data types, do not use storage
space outside of the four-bits in the CD region.
Page Compression
Page compression works differently than row compression. It is applied to the entire page, only after the page is
full and only when the compression saves a significant amount of space on the page. Moreover, SQL Server does
not use page compression on non-leaf index levels—they are compressed with row compression when page
compression is used.
Page compression consists of three different stages. First, SQL Server performs row compression of the rows.
Next, it performs prefix compression on the column level by locating and reusing the most common prefix, which
reduces the data size for values in that column. Finally, SQL Server does a dictionary compression by removing all of
the duplicates in the data across all columns. Let's examine prefix and dictionary compressions in depth.
At a first step, SQL Server detects the most common prefix in a column's data and finds the longest value that is
using such a prefix. This value is called the anchor value . All other rows on the page store the difference between their
values and the anchor values, rather than the actual values.
Let's look at an example, and assume that we have a four-column table with the data shown in Table 4-4 .
Table 4-4. Page Compression: Original Data
Column 1
Column 2
Column 3
Column 4
PALETTE
CAN
NULL
PONY
PAL
BALL
MILL
HORSE
POX
BILL
MALL
TIGER
PILL
BOX
MAN
BUNNY
 
Search WWH ::




Custom Search