Database Reference
In-Depth Information
Chapter 5
Index Fragmentation
Index fragmentation is, perhaps, one of those rare topics that does not entirely belong to the “it depends” category.
Every database professional agrees that fragmentation negatively affects the system. While that is correct, it is still
important to understand the downside of index fragmentation and analyze how your system is affected by it.
In this chapter, we will talk about internal and external index fragmentation in SQL Server, what code and
design patterns increase the fragmentation, and what factors must be taken into account when designing an index
maintenance strategy.
Types of Fragmentation
As you will remember, SQL Server stores data on data pages combined into eight-page extents. There are two kinds of
extents: mixed extents and uniform ones. Mixed extents store data that belongs to different objects. Uniform extents
store data that belongs to the same object.
When you create the object, the first eight data pages are stored in mixed extents. All subsequent space allocation
is accomplished with uniform extents only. Every data page in clustered and nonclustered indexes has pointers to the
previous and next pages based on index key sorting order.
SQL Server neither reads nor modifies data directly on the disk. A data page needs to be in the memory to be
accessible. Every time SQL Server accesses the data page in memory, it issues a logical read operation. When the data
is not in memory, SQL Server also performs a physical read, which indicates physical disk access.
You can find the number of I/O operations performed by a query on a per-table basis by enabling I/O statistics
using set statistics io on command. An excessive number of logical reads often indicates suboptimal execution
plans due to the missing indexes and/or suboptimal join strategies selected because of incorrect cardinality estimation.
However, you should not use that number as the only criteria during optimization and take other factors into account,
such as resource usage, parallelism, and related operators in the execution plan.
Note
Both logical and physical reads affect the performance of queries. Even though logical reads are very fast, they are
not instantaneous. SQL Server burns CPU cycles while accessing data pages in memory, and physical I/O operations
are slow. Even with a fast disk subsystem, latency quickly adds up with a large number of physical reads.
One of the optimization techniques that SQL Server uses to reduce number of physical reads is called
Read-Ahead . With this technique, SQL Server determines if leaf-level pages reside continuously on the disk based
on intermediate index-level information and reads multiple pages as part of single read operation from the data file.
This increases the chance that the following read requests would reference data pages, which are already cached in
memory, and it minimizes the number of physical reads required. Figure 5-1 illustrates this situation, and it shows
two adjacent extents with all data pages fully populated with data.
 
 
Search WWH ::




Custom Search