Database Reference
In-Depth Information
The columns in the output represent the following:
column_id is the ID of a column in the index.
dictionary_id is ID of a dictionary.
version represents a dictionary format. Both SQL Server 2012 and 2014 return 1 as its value.
type represents the type of values stored in a dictionary. It can have one of the following
three values:
Dictionary contains int values is specified by type = 1
Dictionary contains string values is specified by type = 3
Dictionary contains float values is specified by type = 4
last_id is a last data id in a dictionary .
entry_count contains the number of entries in a dictionary .
on_disk_size indicates the size of a dictionary in bytes.
Design Considerations and Best Practices for
Columnstore Indexes
The subject of designing efficient Data Warehouse solutions is very broad and impossible to cover completely in this
book. However, it is equally impossible to avoid such discussion entirely.
Reducing Data Row Size
Regardless of the indexing technologies in use, most I/O activity in Data Warehouse systems is related to scanning
facts tables' data. Efficient design of facts tables is one of the key factors in Data Warehouse performance.
It is always advantageous to reduce the size of a data row, and it is even more critical in the case of facts tables
in Data Warehouses. By making data rows smaller, we reduce the size of the table on-disk and the number of I/O
operations during a scan. Moreover, it reduces the memory footprint of the data and makes batch-mode execution
more efficient due to better utilization of the internal CPU cache.
As you will remember, one of the key factors in reducing data size is the usage of correct data types for values.
You can think about storing Boolean values in int data types, or using datetime when a value requires just up to the
minute precision, as examples of bad design. Always use the smallest data type that can store column values and that
provides the required precision for the data.
Giving SQL Server as Much Information as Possible
Knowledge is power. The more SQL Server knows about the data, the better the chances that an efficient execution
plan is generated.
Unfortunately, nullability of columns is one of the most obvious but frequently overlooked factors. Defining
columns as NOT NULL when appropriate helps Query Optimizer and, in some cases, reduces the storage space
required for the data. It also allows SQL Server to avoid unnecessary encoding in columnstore indexes and during
batch-mode execution.
 
Search WWH ::




Custom Search