Database Reference
In-Depth Information
data compression can be useful when you deal with third-party systems with excessive use of fixed-length
data types. For example, some independent software vendors use the fixed-length char data type to store text information.
implementing data compression here significantly reduces table storage space and improves system performance
transparently to the applications.
Tip
You can estimate the space saving of row and page compression by using the sp_estimate_data_compression_
savings stored procedure. This procedure copies a sample of the data to tempdb and applies the desired data
compression method, estimating the space savings. Obviously, it can produce incorrect results if data is distributed
unevenly.
You can apply different data compression methods on a per-index basis. For example, when the data is volatile,
you may consider using row compression with a clustered index, keeping some of the nonclustered indexes
uncompressed. An example of this use would be on variable-length columns or on fixed-length columns with data
types that always use all storage space.
In the case of partitioned tables, compression can be applied on a per-partition basis. For example, you may
decide to use row compression or no compression at all for partitions with volatile operational data and page
compression for static archived data.
Note
We will talk about partitioned tables in greater detail in Chapter 15, “data partitioning.”
Sparse Columns
Sparse columns , introduced in SQL Server 2008, have a special storage format optimized for the storage of NULL
values. As you will remember, without data compression, fixed-length columns always use the same storage space
even when they store NULL values. Variable-length columns use an amount of space based on the size of the value
with an extra two bytes in the variable-length offset array.
When a column is defined as sparse, it does not use any storage space when it is NULL at the cost of extra storage
overhead in cases of NOT NULL value. This storage overhead is four bytes for fixed-length data types and two bytes for
variable-length data types.
even though null fixed-length data types do not use storage space when defined as sparse, you should
not interchange them with variable-length data types. a sparse char column would be stored in-row when it is not null
and contribute towards the 8,060 maximum row size limit. alternatively, a sparse varchar column can be stored in a
row-overflow page if needed.
Caution
Sparse column data is stored in a special part of the row called the sparse vector . I am not going to dive into the
sparse vector internal storage format, but I want to mention that it is located after the variable-length portion of the
row. Moreover, the sparse vector adds extra storage overhead, which increases size of the row and counts towards the
8,060-byte limit.
Table 4-8 shows the required storage space used by the data type for a regular nonsparse column and sparse
column that stores not a null value. It also shows what the minimum percent of rows that must have NULL values to
achieve a net space saving of 40 percent.
 
 
Search WWH ::




Custom Search