Special Indexing and Storage Features - Pro SQL Server Internals

Database Reference

In-Depth Information

Chapter 4

Special Indexing and Storage

Features

This chapter discusses several storage- and indexing-related features available in SQL Server. It covers indexes with

included columns, filtered indexes and statistics, data compression, and sparse columns.

Indexes with Included Columns

As you already know, SQL Server rarely uses nonclustered indexes when it expects that a large number of Key or RID

Lookups is required. Those operations usually lead to a large number of reads, both logical and physical.

With Key Lookup operations, SQL Server accesses multiple data pages from different levels in a clustered index

every time it needs to obtain a single row. Even though root and intermediate index levels are usually cached in the

buffer pool, access to leaf-level pages produces random, and often physical, I/O reads, which are slow, especially

in the case of magnetic hard disk drives.

This is also true for heap tables. Even though the row-id in a nonclustered index stores the physical location of the

row from the heap table, and RID lookup does not need to traverse the clustered index tree, it still introduces random

I/O. Moreover, forwarding pointers can lead to extra reads if a row had been updated and moved to another page.

The existence of Key or RID lookups is the crucial factor here. Rows in a nonclustered index are smaller than

in a clustered index. Nonclustered indexes use fewer data pages and, therefore, are more efficient. SQL Server uses

nonclustered indexes even when it expects that a large number of rows need to be selected, as long as Key or RID

lookups are not required.

As you will recall, nonclustered indexes store data from the index key columns and row-id. For tables with

clustered indexes, the row-id is the clustered key value of the index row. The values in all indexes are the same: when

you update the row, SQL Server synchronously updates all indexes.

SQL Server does not need to perform Key or RID lookups when all of the data a query needs exists in a

nonclustered index. Those indexes are called covering indexes, because they provide all of the information that a

query needs, and they are essentially a covering query.

Making nonclustered indexes covering ones is one of the most commonly used query optimization techniques.

In the past, the only way to achieve that was to add columns, referenced by the queries, as the rightmost index key

columns. Even though this method generally worked, it had a few disadvantages.

First, SQL Server stores sorted index rows based on index key values. An update of the index key columns can lead to a

situation where a row needs to be moved to a different place in the index, which increases the I/O load and fragmentation.

Second, new columns increase the size of the index key, which can potentially increase the number of levels in

the index, making it less efficient.

Finally, a nonclustered index key cannot exceed 900 bytes. As a result, you cannot add a large amount of data nor

LOB columns to the index. Even though making a large index key is not necessarily a good idea, it could be helpful in

some cases.

Search WWH ::

Custom Search

Home