Database Reference
In-Depth Information
From the previous index pages, you can see that the page split caused by the INSERT statement spreads the
rows sparsely across the leaf pages, causing internal fragmentation. It often causes external fragmentation also,
since the new leaf page may not be physically adjacent to the original page. For a large table with a high amount
of fragmentation, the page splits caused by the INSERT statement will require a larger number of leaf pages to
accommodate all the index rows.
To demonstrate the row distribution shown in the index pages, you can run the script to create dbo.Test1 again,
adding more rows to the pages:
INSERT INTO dbo.Test1
VALUES (410, 'C4', ''),
(900, 'C4', '');
The result is the same as for the previous example: These new rows can be accommodated in the two existing
leaf pages without causing any page split. You can validate that by calling DBCC IND and DBCC PAGE . Note that in the
first page, new rows are added in between the other rows in the page. This won't cause a page split since free space is
available in the page.
What about when you have to add rows to the trailing end of an index? In this case, even if a new page is required,
it won't split any existing page. For example, adding a new row with C1 equal to 1,300 will require a new page, but it
won't cause a page split since the row isn't added in an intermediate position. Therefore, if new rows are added in the
order of the clustered index, then the index rows will be always added at the trailing end of the index, preventing the
page splits otherwise caused by the INSERT statements.
Fragmentation caused by page splits hurts data retrieval performance, as you will see next.
Fragmentation Overhead
Both internal and external fragmentations adversely affect data retrieval performance. External fragmentation
causes a noncontiguous sequence of index pages on the disk, with new leaf pages far from the original leaf pages and
with their physical ordering different from their logical ordering. Consequently, a range scan on an index will need
more switches between the corresponding extents than ideally required, as explained earlier in the chapter. Also, a
range scan on an index will be unable to benefit from read-ahead operations performed on the disk. If the pages are
arranged contiguously, then a read-ahead operation can read pages in advance without much head movement.
For better performance, it is preferable to use sequential I/O, since this can read a whole extent (eight 8KB pages
together) in a single disk I/O operation. By contrast, a noncontiguous layout of pages requires nonsequential or
random I/O operations to retrieve the index pages from the disk, and a random I/O operation can read only 8KB of
data in a single disk operation (this may be acceptable, however, if you are retrieving only one row). The increasing
speed of hard drives, especially SSDs, has reduced the impact of this issue, but it's still there.
In the case of internal fragmentation, rows are distributed sparsely across a large number of pages, increasing the
number of disk I/O operations required to read the index pages into memory and increasing the number of logical
reads required to retrieve multiple index rows from memory. As mentioned earlier, even though it increases the cost
of data retrieval, a little internal fragmentation can be beneficial because it allows you to perform INSERT and UPDATE
queries without causing page splits. For queries that don't have to traverse a series of pages to retrieve the data,
fragmentation can have minimal impact. Put another way, retrieving a single value from the index won't be impacted
by the fragmentation; or, at most, it might have an additional level in the B-Tree that it has to travel down.
To understand how fragmentation affects the cost of a query, create a test table with a clustered index and insert
a highly fragmented data set in the table. Since an INSERT operation in between an ordered data set can cause a page
split, you can easily create the fragmented data set by adding rows in the following order:
IF (SELECT OBJECT_ID('Test1')
) IS NOT NULL
DROP TABLE dbo.Test1;
GO
 
Search WWH ::




Custom Search