Database Reference
In-Depth Information
ORDER BY has some limitations, however. At this time, it is the only
nonparallelizable operation in BigQuery. That means that the entire sort
operation must occur in a single query worker. To prevent that single worker
from getting bogged down, there are limits on the size of result that an
ORDER BY can process. If you get a Response Too Large error that you didn't
expect, the ORDER BY clause may be the culprit.
Finally, you have a limit on the number of rows that are returned:
LIMIT 10
It is generally a good idea to add a limit to your query to prevent the
dreaded Response Too Large error that is returned whenever the query
result size is larger than 128 MB (unless, of course, you have set the
allow_large_results option on your query).
If you're interested in why the Response Too Large error exists, Chapter 9
describes how query execution works and why result sizes for most queries
are limited. For the most part, however, the maximum result size is not an
issue because you usually care about only the first few results of your query.
In addition, much of the time when the query returns too many results, it
means that there was something wrong with the query—a filter condition
was missing, or something about the data was unexpected.
Query #2: Aggregation
SELECT word, COUNT(*) AS corpora, SUM(word_count) AS
total
FROM [publicdata:samples.shakespeare]
WHERE LENGTH(word) > 5
GROUP BY word
HAVING corpora >=2 AND corpora <> total
ORDER BY corpora DESC, total DESC
LIMIT 20
Although it is interesting to see the word frequency in a single Shakespeare
play, it may be more interesting to see how the word frequencies look
across all the plays that Shakespeare wrote. Query #2 returns the words
that Shakespeare uses in more than two plays and appear more than once
in some play, and ranks them by how many plays they appear in and the
Search WWH ::




Custom Search