Data Partitioning - Pro SQL Server Internals

Database Reference

In-Depth Information

When to Partition?

Database professionals often assume that data partitioning is required only for very large databases (VLDB). Even

though database size definitely matters, it is hardly the only factor to consider.

Service-level agreements (SLA) are one of the key elements in the decision to partition or not. When a system has

an availability-based SLA clause, data partitioning becomes essential. The duration of possible downtime depends on

how quickly you can recover a database and restore it from a backup after disaster. That time depends on the total size

of the essential filegroups that need to be online for the system to be functional. Data partitioning is the only approach

that allows you to separate data between different filegroups and use a piecemeal restore to minimize downtime.

A performance-based SLA clause is another important factor. Data partitioning can help address some of the

challenges of performance tuning. For example, by partitioning data between multiple tables, you will improve the

accuracy of statistics and you can use different indexing strategies for historical and operational data. Moreover, data

partitioning allows you to implement a tiered storage approach and put the operational part of the data on faster disks,

which improves the performance of the system. We will discuss tiered storage in greater detail later in the chapter.

The key point to remember is that you should not rely on database size as the only criteria for partitioning.

Consider data partitioning merely to be a tool that helps you address some of the challenges. This tool can be useful

regardless of database size.

Nevertheless, data partitioning comes at a cost. It changes the execution plans of queries and often requires

code refactoring. You need to keep this in mind, especially in the case of new development. When you expect a

system to collect a large amount of data in the future, it is often better to implement data partitioning at a very early

development stage. Even though data partitioning introduces development overhead, such overhead may be much

smaller than that which is involved in code refactoring and the retesting of the production system with large amounts

of data.

■ the cost of data partitioning is especially important for independent software vendors (iSV) who are deploying a

system to a large number of customers. it is a mistake to assume that data partitioning can be implemented transparently

to the client applications. even though table partitioning does not require any code changes, it affects execution plans and

can lead to performance issues, which requires code refactoring and, therefore, a separate code base to support.

Note

Finally, it is often very hard if not impossible to partition the data while keeping the database online and available

to users. Moving large amounts of data around can be time consuming and can lead to long downtimes. This is

another argument for implementing data partitioning during the initial stage of development.

Data Partitioning Techniques

There are two data partitioning techniques available in SQL Server: partitioned tables and partitioned views . Let's look

at them in detail.

■ in this chapter, we will use an order entry system, which stores order information for two and a half years, as

our example. Let's assume that we want to partition the data on a monthly basis and that our operational period consists

of two months: May and June 2014.

Note

Search WWH ::

Custom Search

Home