Tuning Indexes - Professional SQL Server 2005 Performance Tuning

Databases Reference

In-Depth Information

Clustered Indexes

The leaf level of a clustered index contains all the data, which is why we refer to it as defining the physical

ordering of the rows. A common question is, ''What should I put my clustered index on?'' Here are some

sensible guidelines that you can follow to make that decision a bit easier:

❑

Unique data: A clustered index key (the column(s) you choose to create a clustered index on) has

to be unique. If the data itself isn't unique then SQL Server will add a 4-byte UniqueIdentifier to

every duplicate value to allow you to use that key. You can avoid this unnecessary overhead

(as small as it might be) by creating your index on a column that already has unique values.

❑

Narrow data: A narrow column will help to keep your clustered index b-tree short and will help

to keep your other indexes smaller. The clustered index key is duplicated at the leaf-level of

every non-clustered index you create so it affects all your other indexes.

❑

Static data: Changes to the key can be costly, as the change needs to be cascaded to the non-

clustered indexes and the foreign keys if your cluster key is also the primary key. Your clus-

ter key should be static to avoid cascading updates. See Chapter 8 for details on primary/foreign

keys and referential integrity.

ClusteringonanIdentityColumn

An identity column in SQL Server is an automatically generated incremental integer that makes an ideal

clustered index key.

❑

It's naturally unique within the table.

❑

It's narrow because an integer is only 4 bytes in size.

❑

It's static because it has no relevance to the business data.

❑

It's incremental.

This creates a natural ''hot spot'' because the pages needed for INSERT will already be in cache and it

helps to reduce fragmentation caused by page-splits. The caveat is that this isn't appropriate for a very

high number of inserts because the hot spot will become the bottleneck. If you're going to have more

than 400 inserts a second to a table, you should look at alternatives.

UniqueIdentifier/GUID

UniqueIdentifier is a SQL Server data type for GUIDs (Globally Unique IDentifiers) which are 16-byte

computer generated values created based on your hardware (to an extent) to try and guarantee unique-

ness. It's not actually unique but getting a duplicate is comparable in probability to winning the lottery.

They would seem to be ideal for merging data from different sources, because they'll be unique from

any source, but GUIDs tend to be avoided as a clustering key because of their large size, complexity in

debugging, and their random nature, causing slow inserts. They are created by the newID() function and

you can test it yourself by running select newID() . On one of our laptops this was the GUID it produced:

D29F244C-541C-4542-9CEF-BEF305FDA3D4

SQL Server 2005 has a new function for creating GUIDs that ensures that each one is greater than the

previous one created on the same machine. The function is newsequentialID() , and it can only be used

as the DEFAULT value for a column with datatype UNIQUEIDENTIFIER. There is such a column in the

Professional SQL Server 2005 Performance Tuning

Search WWH ::

Custom Search

Home