Database Reference
In-Depth Information
The second type of encoding, called value-based encoding , is mainly used for numeric and integer data types that
do not have enough duplicated values. With this condition, dictionary encoding is inefficient. The purpose of value-based
encoding is to convert integer and numeric values to a smaller range of 64-bit integers. This process consists of the
following two steps.
In the first step, numeric data types are converted to integers using the minimum positive exponent that allows
this conversion. Such an exponent is called magnitude . For example, for a set of values such as 0.8, 1.24, and 1.1,
the minimum exponent is 2, which represents a multiplier of 100. After this exponent is applied, values would be
converted to 80, 124, and 110 respectively. The goal of this process is to convert all numeric values to integers.
Alternatively, for integer data types, SQL Server chooses the smallest negative exponent that can be applied to
all values without losing their precision. For example, for the values 1340, 20, and 2,340, that exponent is -1, which
represents a divider of 10. After this operation, the values would be converted to 134, 2, and 234 respectively. The goal
of such an operation is to reduce the interval between the minimum and maximum values stored in the segment.
During the second step, SQL Server chooses the base value , which is the minimum value in the segment, and it
subtracts it from all other values. This makes the minimum value in the segment 0.
Figure 34-17 illustrates the process of value-based encoding.
Figure 34-17. Value-based encoding
After encoding, SQL Server compresses the data and stores it as a LOB allocation unit. We have discussed how
this type of data is stored in Chapter 1 “Data Storage Internals.”
Listing 34-11 shows a query that displays allocation units for the dbo.FactSales table.
Listing 34-11. dbo.FactSales table allocation units
select i.name as [Index], p.index_id
,p.partition_number as [Partition]
,p.data_compression_desc as [Compression]
,u.type_desc, u.total_pages
from
sys.partitions p join sys.allocation_units u on
p.partition_id = u.container_id
join sys.indexes i on
p.object_id = i.object_id and
p.index_id = i.index_id
where
p.object_id = object_id(N'dbo.FactSales')
As you can see in Figure 34-18 , the columnstore index is stored as LOB_DATA. It is worth noting that this index
has IN_ROW_DATA allocation units; however, these allocation units do not store any data. It is impossible to have
LOB_DATA allocation in the index without an IN_ROW_DATA allocation present.
 
Search WWH ::




Custom Search