Databases Reference
In-Depth Information
Histograms
In SQL Server, histograms are created only for the first column of a statistics object,
and they compress the information of the distribution of values in those columns by
partitioning that information into subsets called buckets or steps . The maximum number
of steps in a histogram is 200, but even if the input has 200 or more unique values, a
histogram may still have less than 200 steps. To build the histogram, SQL Server finds
the unique values in the column and tries to capture the most frequent ones using a
variation of the maxdiff algorithm, so that the most statistically significant information
is preserved. Maxdiff is one of the available histograms whose purpose is to accurately
represent the distribution of data values in relational databases.
To see how the histogram is used, run the following statement to display the
current statistics of the IX_SalesOrderDetail_ProductID index on the
Sales.SalesOrderDetail table:
DBCC SHOW_STATISTICS ( 'Sales.SalesOrderDetail' , IX_SalesOrderDetail_ProductID )
Listing 3-11.
Both the multi-column index and statistics objects include the columns ProductID ,
SalesOrderID , and SalesOrderDetailID , but since the histogram is only for the
first column, this data is only available for the ProductID column.
Next, I will show you some examples of how the histogram may be used to estimate the
cardinality of some simple predicates. Let's take a look at a section of the histogram, as
shown in the output in Listing 3-12.
Search WWH ::




Custom Search