Query Optimization and Execution - Pro SQL Server Internals

Database Reference

In-Depth Information

Hash Aggregate

A hash aggregate is very similar to a hash join. It is targeted towards large input and requires memory to store the

hash table. The hash aggregate algorithm is shown in Listing 25-12.

Listing 25-12. Hash Aggregate algorithm

for each row R1 from input

begin

calculate hash value of R1 group columns

check for a matching row in hash table

if matching row exists

update aggregate results of matching row

else

insert new row into hash table

end

return all rows from hash table with aggregate results

Similar to a hash join, a hash table can be spilled to tempdb , which negatively affects the performance of

the aggregate.

Comparing Aggregates

As with joins, stream and hash aggregates are targeted towards different use-cases. A stream aggregate works best

with sorted input—either because of existing indexes or when the amount of data is small and can be easily sorted.

A hash aggregate, on the other hand, is targeted towards large unsorted inputs.

Table 25-3 compares hash and stream aggregates.

Table 25-3. Aggregate comparison

Stream Aggregate

Hash Aggregate

Best use-case

Small size input where data can be sorted with the Sort

operator or pre-sorted input.

Medium-to-large unsorted input.

Requires sorted input

Yes

No

Blocking

No. However it often requires a blocking Sort operator.

Yes

Uses memory

No

Yes

Uses tempdb

No

Yes in case of spills

You should consider the cost of the Sort operator during performance tuning if it is used only to support a stream

aggregate pre-requirement. The cost of sorting usually exceeds the cost of stream aggregation itself. You can often

remove it by creating indexes, which would sort the data in the order required for a stream aggregate.

Spools

Spool operators, in a nutshell, are internal in-memory or on-disk caches/temporary tables. SQL Server often uses

spools for performance reasons to cache the results of complex subexpressions that need to be used several times

during query execution.

Pro SQL Server Internals

Search WWH ::

Custom Search

Home