Database Reference
In-Depth Information
GROUP EACH BY title
ORDER BY cnt DESC
LIMIT 100
JOIN EACH
If you don't specify EACH when you perform a JOIN operation, BigQuery
requires that the table on the right side of the join be small enough to send
the entire thing to every node of the computation tree. This means that if
the table on the right side of the join doesn't fit into less than 8 MB when
compressed, your query will fail. For example, if you take the query from the
OUTER JOIN example and reverse the left and right tables and switch to an
INNER JOIN , you get:
SELECT github.repository_name,
github.repository_owner,
shakespeare.corpus_date
FROM (
SELECT corpus, max(corpus_date) AS corpus_date
FROM [publicdata:samples.shakespeare]
GROUP BY corpus) shakespeare
INNER JOIN [publicdata:samples.github_timeline] github
ON shakespeare.corpus = github.repository_name
This query fails with the error “The JOIN operator's right-side table must
be a small table.” This isn't as big of a limitation as it sounds because many
JOIN queries use a subselect to narrow down the table that is joined against;
as long as the computed result is smaller than 8 MB, the JOIN will succeed.
However, when that is not possible you can fix the query by adding the EACH
keyword to the join:
SELECT github.repository_name,
github.repository_owner,
shakespeare.corpus_date
FROM (
SELECT corpus, max(corpus_date) AS corpus_date
FROM [publicdata:samples.shakespeare]
GROUP BY corpus) shakespeare
INNER JOIN EACH [publicdata:samples.github_timeline]
Search WWH ::




Custom Search