Special Indexing and Storage Features - Pro SQL Server Internals

Database Reference

In-Depth Information

■ data compression can be useful when you deal with third-party systems with excessive use of fixed-length

data types. For example, some independent software vendors use the fixed-length char data type to store text information.

implementing data compression here significantly reduces table storage space and improves system performance

transparently to the applications.

Tip

You can estimate the space saving of row and page compression by using the sp_estimate_data_compression_

savings stored procedure. This procedure copies a sample of the data to tempdb and applies the desired data

compression method, estimating the space savings. Obviously, it can produce incorrect results if data is distributed

unevenly.

You can apply different data compression methods on a per-index basis. For example, when the data is volatile,

you may consider using row compression with a clustered index, keeping some of the nonclustered indexes

uncompressed. An example of this use would be on variable-length columns or on fixed-length columns with data

types that always use all storage space.

In the case of partitioned tables, compression can be applied on a per-partition basis. For example, you may

decide to use row compression or no compression at all for partitions with volatile operational data and page

compression for static archived data.

■

Note

We will talk about partitioned tables in greater detail in Chapter 15, “data partitioning.”

Sparse Columns

Sparse columns , introduced in SQL Server 2008, have a special storage format optimized for the storage of NULL

values. As you will remember, without data compression, fixed-length columns always use the same storage space

even when they store NULL values. Variable-length columns use an amount of space based on the size of the value

with an extra two bytes in the variable-length offset array.

When a column is defined as sparse, it does not use any storage space when it is NULL at the cost of extra storage

overhead in cases of NOT NULL value. This storage overhead is four bytes for fixed-length data types and two bytes for

variable-length data types.

■ even though null fixed-length data types do not use storage space when defined as sparse, you should

not interchange them with variable-length data types. a sparse char column would be stored in-row when it is not null

and contribute towards the 8,060 maximum row size limit. alternatively, a sparse varchar column can be stored in a

row-overflow page if needed.

Caution

Sparse column data is stored in a special part of the row called the sparse vector . I am not going to dive into the

sparse vector internal storage format, but I want to mention that it is located after the variable-length portion of the

row. Moreover, the sparse vector adds extra storage overhead, which increases size of the row and counts towards the

8,060-byte limit.

Table 4-8 shows the required storage space used by the data type for a regular nonsparse column and sparse

column that stores not a null value. It also shows what the minimum percent of rows that must have NULL values to

achieve a net space saving of 40 percent.

Search WWH ::

Custom Search

Home