Database Reference
In-Depth Information
SELECT COUNT(word), SUM(word_count)
FROM (
SELECT word, SUM(word_count) AS word_count
FROM [publicdata:samples.shakespeare]
GROUP BY word)
Size Limitations in Ordered Results
In most SQL databases, you can add an
ORDER BY
X
clause to the end of
your query, and no matter what, your results will come back in a prescribed
order. When you have large data, however, sorting is expensive.
If you run an
ORDER BY
operation on too many values, BigQuery returns
a Too Many Results error. This error often confuses people because
ORDER
BY
is often combined with a
LIMIT
clause limiting the number of results.
How can there be too many results if you limit it to 10? The problem is that
the
ORDER BY
must be done over the entire result set before the limit is
applied—the query engine can't just take 10 random results and return those
in order.
There is only a partial workaround for the size limitation in the
ORDER BY
operation. Much of the time when you run a query that has an
ORDER BY
operation, you're interested only in the first few results. You want to find the
highest value, or the most frequent, and so on. If you run queries that have
this pattern but have a large number of potential results, you can use the
TOP
function. Following is an example of rewriting a query with an
ORDER
BY
into a
TOP
query:
SELECT corpus, COUNT(*) AS total
FROM [publicdata:samples.shakespeare]
GROUP BY corpus
ORDER BY total DESC
LIMIT 20
You can rewrite as
SELECT TOP(corpus, 20), COUNT(*) AS total
FROM [publicdata:samples.shakespeare]