Databases Reference
In-Depth Information
There's more...
Fragmentation comes into the picture when the logical sorting order of data in the leaf pages
of an index doesn't match with the physical order of data in the actual data pages. When you
create an index, it sorts everything, but when data gets manipulated by DML commands,
there is no guarantee about which data pages will be used to accommodate the new records.
When you delete records, they get deleted from the actual data page, and the free space in
that data page might be used for any other records, which creates fragmentation.
Just keep in mind that when an index is first built up, there is no or very less
fragmentation, but after having INSERT , UPDATE , and DELETE DMLs on the
table, it starts creating fragmentation.
Playing with Fill Factor
It is already discussed in the previous chapter, Chapter 9 , Implementing Index , that when an
index is being created, it stores data in the B-Tree format that has a root page, intermediate
level, and leaf level. Leaf level, which is the bottom-most level, contains the actual data in a
clustered index. Each data page is 8 KB in size.
When insertion/updation/deletion of data to/from a table happens, it tries to insert in the
proper data page according to the record being inserted. For example, we have clustered
index on SSN number. We are inserting a new row with the SSN number. SQL Server tries to
insert that record in an appropriate data page. Suppose our SSN number starts with "2", then
it will find the last page that has SSN number starting with "2". SQL Server will insert our new
row in that page only. If your 8 KB page is full or doesn't have enough room to accommodate
the new row whose SSN starts with "2", it will split the page, and the data in that page will
be shared between two or more pages of records, based on the size of the row whose SSN
number starts with "2". Now we have two pages that are half filled, considering that the row
is not too big; otherwise, we might have more than two pages as well. So our new row will be
accommodated in that page. SQL Server tends to add new pages on the right-hand side of the
current page in hierarchy of data pages.
If there is enough space in the data page that can accommodate the new row of SSN number
starting with "2", the process doesn't need to wait till the page is getting split and then
finish the I/O overhead. There is one more overhead; the page split task locks the page and
prevents its usage until the process is finished.
This is the time when Fill Factor comes into the picture. Fill Factor decides how much of your
page would be filled up initially. Suppose you give "10" as the Fill Factor, then your data page
will consume only 10 percent of your 8 KB page size, and when you exceed this limit of 10
percent, it will keep 90 percent of the page empty and create a new page for other records.
 
Search WWH ::




Custom Search