Database Reference
In-Depth Information
GROUP BY word . The reason to do this is because there are many rows
in the Shakespeare table for each word—one for each corpus that used that
word. There are also multiple rows in the Wikipedia table for each title—one
for each revision to the entry.
If you had just done the JOIN without a GROUP BY , you'd get one row
for each corpus containing the word multiplied by the number of revisions.
The word “Peace” is in 33 of Shakespeare's plays and the “Peace” Wikipedia
entry has 2741 revisions, so joining just those two keys would generate 33 x
2741= 90,453 rows in the result—just for that one word. Because this isn't
what you want, you should do the GROUP BY on the Shakespeare table so
that you collapse the Hamlet entries down to one row, and you end up with
only 2 million rows in the result.
Here is how this query executes:
1. The mixer receives the query. This is a two-stage query; you have to
compute the results of the subquery on the Shakespeare table in order
to find and prepare the RIGHT table of the JOIN . To kick off the first
stage, the mixer extracts the subquery SELECT word FROM
[publicdata:samples.shakespeare] GROUP BY word and
sends it to the shards.
2. Each shard receives the subquery, reads the Shakespeare table, and
performs the GROUP BY aggregation. Each shard then returns one row
per Shakespeare corpus.
3. The mixer completes the aggregation from the shards' responses, and
ends up with an in-memory table containing one row per corpus. It
names that result table [__inline_table] so that it can be referred
to later.
4. Once the results of the subquery have been computed, the mixer sends
the following query to the shards: SELECT wiki.title FROM
[wiki_table_path] AS wiki JOIN [__inline_table] AS
shakes ON wiki.title = shakes.word . It also sends the inline
result table along with the query.
5. Each shard gets this query and the inline table computed in the
subquery. It takes the inline table and generates a lookup table from it.
The keys of the lookup table are the Shakespearean words and the
Search WWH ::




Custom Search