Database Reference
In-Depth Information
Why Does Comma Mean UNION ALL and
Not JOIN?
If we had to come up with the one deviation from the standard in
BigQuery SQL that caused the most confusion, the choice of comma to
mean UNION ALL and not JOIN would likely be it. This topic mentions
it several times because we wanted to make sure that this nonstandard
pattern doesn't surprise users.
In BigQuery, UNION is much more common than JOIN . Users often
split their tables up by day or by customer. This partitioning can help
them manage data lifetime (for example, they delete tables older than
90 days) or to save them money on queries (because query cost is
proportional to number of bytes in the columns referenced). But by
splitting up data into multiple pieces, it means that users often want to
query against multiple tables at once. For instance, if you have daily
tables and want to query against a month's worth of data, you need to
UNION ALL 30 different tables together.
From a syntax perspective, UNION ALL is clumsy; you don't UNION
together table names, you UNION together subqueries. So to UNION a
week's worth of tables together, you'd need to write the same query
over and over again:
(SELECT foo FROM table1)
UNION ALL
(SELECT foo FROM table2)
UNION ALL
(SELECT foo FROM table3)
. . .
The BigQuery syntax lets you list the tables separated by commas:
SELECT foo FROM table1, table2, table3. . .
This is much easier to write and to understand. Are there other
syntaxes that would have been almost as compact but didn't violate
Search WWH ::




Custom Search