Data compression is another important factor to consider. Static historical data usually benefits from page-level data compression, which can significantly reduce the required storage space. Moreover, it can improve the performance of queries against historical data in non-CPU-bound systems by reducing the number of I/O operations required to read the data. At the same time, page-level data compression introduces unnecessary CPU overhead when the data is volatile.
Tip
In some cases, it is beneficial to use page compression even with volatile operational data when it saves a significant amount of space and the system works under a heavy I/O load. As usual, you should test and monitor how it affects the system.
Unfortunately, it is impossible to compress part of the data in a table. You would either have to compress the
entire table, which would introduce CPU overhead on operational data, or keep the historical data uncompressed at
additional storage and I/O cost.
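As an illustration, here is a minimal T-SQL sketch using a hypothetical dbo.Orders table. Without partitioning, page compression can be applied only to the table as a whole; with a partitioned table, you can compress just the partitions that store historical data.

-- Non-partitioned table: compression can only be applied to the entire table
ALTER TABLE dbo.Orders REBUILD WITH (DATA_COMPRESSION = PAGE);

-- Partitioned table: compress only the historical partition (partition 1 in this sketch);
-- nonclustered indexes are compressed separately with ALTER INDEX ... REBUILD
ALTER TABLE dbo.Orders REBUILD PARTITION = 1 WITH (DATA_COMPRESSION = PAGE);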
In the case of read-only historical data, it could be beneficial to exclude it from FULL database backups. This would reduce the size of the backup file, as well as the I/O and network load during backup operations. Regrettably, partial database backups work at the filegroup level, which makes this impossible when the data is not partitioned.
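As a sketch, assuming that the historical data has been placed in a separate read-only filegroup, a partial backup that skips it could look like the following (the database and path names are placeholders):

-- Back up only the read-write filegroups, excluding read-only historical filegroups
BACKUP DATABASE OrderEntryDB
    READ_WRITE_FILEGROUPS
    TO DISK = N'X:\Backups\OrderEntryDB_partial.bak';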
The Enterprise Edition of SQL Server supports piecemeal restore, which allows you to restore a database and bring it online on a filegroup-by-filegroup basis. This enables a disaster recovery strategy in which operational data is restored and made available to customers separately from, and ahead of, the historical data. Such an approach can significantly reduce the disaster recovery time for large databases.
Unfortunately, such a design requires separation of operational and historical data between different filegroups,
which is impossible when the data is not partitioned.
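The following sketch outlines such a restore sequence, assuming that operational data lives in the PRIMARY and OperationalData filegroups and that the historical data resides in a read-only HistoricalData filegroup backed up separately (all names and paths are hypothetical):

-- Step 1: restore the operational filegroups and bring them online first
RESTORE DATABASE OrderEntryDB
    FILEGROUP = N'PRIMARY', FILEGROUP = N'OperationalData'
    FROM DISK = N'X:\Backups\OrderEntryDB_partial.bak'
    WITH PARTIAL, NORECOVERY;
RESTORE LOG OrderEntryDB
    FROM DISK = N'X:\Backups\OrderEntryDB_tail.trn'
    WITH RECOVERY;
-- Operational data is accessible to customers at this point

-- Step 2: restore the read-only historical filegroup later, while the rest
-- of the database remains online
RESTORE DATABASE OrderEntryDB
    FILEGROUP = N'HistoricalData'
    FROM DISK = N'X:\Backups\OrderEntryDB_historical.bak'
    WITH RECOVERY;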
Note
We will discuss backup and disaster recovery strategies in greater detail in Chapter 30, "Designing a Backup Strategy."
Another important factor is statistics. As you will remember, the statistics histogram stores a maximum of 200 steps, regardless of the table size. As a result, each histogram step on a large table must cover a bigger interval of key values. This makes the statistics and cardinality estimations less accurate, which can lead to suboptimal execution plans for large tables. Moreover, SQL Server tracks the number of changes made to the statistics columns and, by default, requires this number to exceed about 20 percent of the total number of rows in the table before the statistics become outdated. Therefore, statistics are rarely updated automatically on large tables.
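To put this in perspective, on a table with 100 million rows, roughly 20 million rows must be modified before the statistics are updated automatically. You can monitor how close the statistics are to that threshold with the sys.dm_db_stats_properties function, as in the following sketch (the function is available in SQL Server 2008R2 SP2 and 2012 SP1 and above; dbo.Orders is a hypothetical table):

-- Compare the modification counter of each statistic with the ~20 percent threshold
SELECT
    s.name AS stats_name,
    sp.last_updated,
    sp.rows,
    sp.modification_counter,
    CAST(sp.rows * 0.20 AS BIGINT) AS approx_auto_update_threshold
FROM sys.stats AS s
    CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) AS sp
WHERE s.object_id = OBJECT_ID(N'dbo.Orders');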
Note
You can use the undocumented trace flag T2371 in SQL Server 2008R2 SP1 and above to make the statistics update threshold dynamic.
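For example, the trace flag could be enabled globally as shown in the sketch below, or specified as the -T2371 startup parameter; as with any undocumented trace flag, test it carefully before enabling it in production.

DBCC TRACEON (2371, -1); -- -1 applies the trace flag globally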
This list is by no means complete, and there are other factors that make data partitioning beneficial, although any one of the aforementioned reasons is enough to start considering it.
 
 