Databases Reference
In-Depth Information
sample database and selecting the top 5 personID values from the people table on this laptop shows 5
sequential GUIDs:
3 1 1A5A22-743F-DC11-9145-000E7B82B6DD
3 2 1A5A22-743F-DC11-9145-000E7B82B6DD
3 3 1A5A22-743F-DC11-9145-000E7B82B6DD
3 4 1A5A22-743F-DC11-9145-000E7B82B6DD
3 5 1A5A22-743F-DC11-9145-000E7B82B6DD
The people table was populated by running usp_loopPeopleInsert , if you want to try it yourself.
NewsequentialID() helps to avoid the slowdown associated with inserting random values, which might
make it appealing, but the size of the key and the complexity it introduces when debugging (complex
because it's long and not human, readable) still exists. Because of this, identity remains the most popular
clustered index key.
Natural versus Surrogate Keys
There is much debate about this topic within the database community, with good arguments for either
side and compelling justification for a middle ground. The debate is about choosing a primary key,
and your primary key doesn't have to be your clustered index key. It's a schema tuning decision and is
covered in detail in Chapter 8.
Let's review the details here as it's relevant to the discussion. The index keys that you've seen so far
are known as surrogate keys because they have no meaning to the business data; they were generated
automatically by the system. You've already seen the advantages of clustering on a surrogate identity
key above.
A natural key has meaning to the business, but the same best practice rules for a clustered index key
should be applied when evaluating it as a choice for primary key. It is much harder to find an ideal
natural key than it is to just use surrogate keys. Natural keys tend to have the risk of change because
they're known to the business. A stock code might be the first choice as a natural primary key but it's still
feasible that the business might want to change it, which will have a cascading effect.
ABadClusteredIndexChoice
A good example of a bad clustered index key would be lastname, because:
It's wide, so inserts could take longer and non-clustered indexes will also be larger.
It's not unique, so SQL Server has to apply a unique identifier, which wastes time.
It's volatile, which will cause cascading updates, and relocation, which introduces
fragmentation.
TheRightClusteredIndex
Generally the best table structure is the right clustered index. The right clustered index key is always
going to be debatable. In this chapter we use surrogate keys for most scenarios and will create one to
cluster on if it doesn't exist. However, there are always exceptions.
I tend to treat a clustered index as being for the benefit of SQL Server internally to give me the most
consistently good insert/update performance and rely on non-clustered indexes for improving query
response times.
Search WWH ::




Custom Search