Understanding Query Execution - Google BigQuery Analytics

Database Reference

In-Depth Information

column referenced in the query. As it scans the table, the shard walks

through the opened column files in parallel, one row at a time.

The first portion of the query to be applied is the WHERE clause, which

filters out rows that aren't interesting. In this case, all rows where the

word length is less than 4 are dropped, and the regular expression is

applied to remove words in all caps.

The shards can apply an optimization here—they need to return only

the top five values ordered by word count. To find the top five values,

they use a data structure called a priority queue , which is an efficient

way of keeping track of the top values without having to sort them all in

place.

3. Each shard returns its top 5 results to the mixer. The mixer can also use

a priority queue to keep track of the top five results from the shards,

and after all the shards have completed, it can return results back to the

caller. The mixer doesn't have to worry about the WHERE clause because

all the values it handles should be valid responses to the query.

If there was no ORDER BY operation, the mixer could perform another

optimization; it could return immediately after it had received five valid

results from the shards. But in this case it has to wait until all the shards

have finished because the top five values might be returned only in the

last shard.

GROUP BY Queries

You saw how a simple query that just does a filter and a sort operates. Now

look at a slightly more complex query that does aggregation:

SELECT corpus, SUM(word_count) AS total_words

FROM [publicdata:samples.shakespeare]

WHERE LENGTH(word) > 4

GROUP BY corpus

ORDER BY total_words DESC

LIMIT 5

This query computes the top five Shakespeare plays in terms of total word

count, excluding words shorter than five characters, and returns them in

reverse order. For example, the top result is Richard III , which has 11,514

words, followed by Hamlet at 11,439 words.

Search WWH ::

Custom Search

Home