Databases Reference
In-Depth Information
The Query Optimizer can select a Hash Aggregate for big tables where the data is not
sorted, there is no
need
to sort it, and its cardinality estimates only a few groups. For
example, the
SalesOrderHeader
table has no index on the
ContactID
column, so the
following query will use a Hash Aggregate operator, as shown in Figure 2-16.
SELECT
ContactID
,
COUNT
(*)
FROM
Sales
.
SalesOrderHeader
GROUP
BY
ContactID
Listing 2-19.
Figure 2-16:
A Hash Aggregate.
As mentioned earlier in this chapter, a hash operation builds a hash table in memory. The
hash key used for this table is displayed on the Properties window, as the
Hash Keys Build
property, as shown in Figure 2-17, which in this case is
ContactID
. Since this table is not
sorted by the required column,
ContactID
, every row scanned can belong to any group.
The algorithm for the Hash Aggregate operator is similar to the Stream Aggregate, with
the exceptions that, in this case, the input data is not sorted, a hash table is created in
memory, and a hash value is calculated for each row processed. For each hash value
calculated, the algorithm will check if the corresponding group already exists on the
hash table and, if it does not, it will create a new entry for it. In this way, the values for
each record are aggregated in this entry on the hash table, and only one row for each
group is stored in memory.