When to Use Shuffled Queries
GROUP EACH BY and JOIN EACH BY queries are extremely powerful
because they enable you to operate over larger and more complex data
sets than you otherwise would be able to process. It is tempting,
therefore, to use the keyword everywhere. However, there are some
disadvantages to using the EACH keyword. Most obviously, using EACH
adds a lot of extra processing to the query, which can slow
performance significantly. Although it is difficult to estimate how long a
shuffle operation will take, you can count on it adding anywhere from
2 to 60 seconds, depending on table sizes.
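To make the syntax concrete, here is a minimal sketch of both shuffled forms in BigQuery's legacy SQL dialect; the dataset and table names are hypothetical:

-- Shuffled aggregation: EACH tells BigQuery to hash-partition the rows
-- by the grouping key across shards before aggregating.
SELECT customer_id, COUNT(*) AS order_count
FROM [mydataset.orders]
GROUP EACH BY customer_id

-- Shuffled join: both sides are hash-partitioned on the join key,
-- which allows two large tables to be joined.
SELECT o.customer_id, c.name
FROM [mydataset.orders] AS o
JOIN EACH [mydataset.customers] AS c
ON o.customer_id = c.customer_id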
What might not be obvious is that there are other cases in which a
shuffled query is slower: when too many rows hash to the same
shard. Because a single shard must process every row that hashes to a
particular value, a value that is too common can overload that shard
and make the query take much longer to process.
For a simple example, imagine a huge customer table with a Country
field. If 99 percent of your customers are in the United States and you
run a GROUP EACH BY Country, a single shard needs to process 99 percent
of the rows. This can take a long time and can even lead to your
queries being canceled.
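As a minimal sketch of that skewed scenario (again with a hypothetical table name):

-- If 99 percent of rows have Country = 'US', the shard that owns the
-- hash of 'US' must aggregate nearly the whole table by itself.
SELECT Country, COUNT(*) AS customer_count
FROM [mydataset.customers]
GROUP EACH BY Country

A plain GROUP BY Country would likely be faster here, because there are only a few distinct countries and the aggregation fits comfortably in the serving tree without a shuffle.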
It is likely that future versions of BigQuery will be smarter about
picking whether to do a shuffle, so you won't have to worry about it. For
now, however, you should probably default to not using the EACH
keyword unless you need it.
Materialize Queries
When you run a normal query in BigQuery, your results are limited to
less than 128 MB. If you think about how the query runs in the Dremel
serving tree, you can see why this limit exists: everything has to be
returned through the root mixer. If you had 100 GB of results, that would be
a lot of data to return from a single server and a lot of data to aggregate in
one place.
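One way to work around this limit is to materialize the results into a destination table instead of returning them through the root mixer. Here is a rough sketch using the bq command-line tool; the dataset and table names are hypothetical, and the flags assume the legacy SQL dialect:

# Write results to a table rather than returning them to the client,
# which lifts the 128 MB response-size limit.
bq query \
  --destination_table=mydataset.big_results \
  --allow_large_results \
  "SELECT customer_id, order_id FROM [mydataset.orders]"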