Tuning the Schema - Professional SQL Server 2005 Performance Tuning

Databases Reference

In-Depth Information

Another unfortunate aspect of clustered indexes is their overuse. Most publications on this subject advise

that every table should have a clustered index. In fact, when you define a primary key, SQL Server

defaults that key as a clustered index. The remainder of this section will discuss some techniques you can

use to determine when to define a clustered index on your tables.

If you've worked with SQL Server any length of time you've probably heard or read statements that

claim that every table should have a clustered index. This is simply wrong and a quick example with the

following table schema will bear that out:

CREATE TABLE Product

(ProductID

uniqueidentifier

NOT NULL,

SupplierID

uniqueidentifier

NULL)

CREATE UNIQUE CLUSTERED INDEX bad_cluster

ON Product (ProductID)

The ProductID column is defined as a uniqueidentifier data type, and a clustered index. A uniqueidenti-

fier column shouldn't be used for a clustered index.

Therefore, not all tables will benefit from a clustered index. The question then becomes, ''Is this example

the only one where a clustered shouldn't be used, or are there others?'' The remainder of this section will

address that question.

Using the following approach will help identify tables deriving benefit from a clustered index. If the table

in question doesn't meet all the requirements, don't create a clustered index.

Don't Create a Clustered Index on an Artificial Column

An artificial column is one whose value is generated by the database engine. In SQL Server, this is most

commonly achieved by two methods:

Designate a column as an identity column.

❑

Define a column's data type as uniqueidentifier and set its default value equal to NEWID().

❑

Consider the second type of artificial column. A uniqueidentifer is the same thing as a GUID. They are

globally unique and random, thus one value has absolutely no relation to another value. Defining a

clustered index on a uniqueidentifier column forces the server to order the rows physically in the table

based on a column whose values are totally random and unrelated to one another.

Now consider the first type of artificial column. The reason for excluding this requires a little under-

standing of SQL Server data page management. SQL Server recognizes two types of tables: heap and

clustered. Any table that doesn't have a clustered index is a heap table. Any table that has a clustered

index is a clustered table. When inserting rows into a heap table the server will simply add the new

row at the end of the table's data pages. However, when inserting rows into a clustered table the server

must find the proper data page to place the new row. When using a typical identity column the proper

data page will always be the last one. Therefore, defining a clustered index on an identity column forces

the server to go through the process of determining the proper data page to store a new row — even

when this location will be the last page. If the clustered index were removed then the heap nature of the

table will naturally cause new rows to be stored on the last data page, without having to go through the

process of determining where.

Professional SQL Server 2005 Performance Tuning

Search WWH ::

Custom Search

Home