shuffle partitions and 7146922576 % 100 = 76. After the shuffle is
complete, the shards return to the mixer indicating they are finished.
(Note that they do not return any results at this point.) The temporary
table with the results is given the name __table1.
3. Once the Shakespeare table rows have been sorted by word value into
distinct partitions, the GROUP BY operation from the RIGHT subquery
can be done entirely in the shards. The shards receive the query SELECT
word FROM [__table1] GROUP BY word. The results of this
subquery are stored in table __table2.
4. The mixer now kicks off a new shuffle on the left table by instructing the
shards to shuffle the [publicdata:samples.wikipedia] table into
500 partitions based on the title field. The results are saved in a
temporary table called __table3 .
5. Now that all the preparatory work is done, the two temporary tables can
be joined. Because the LEFT and RIGHT tables have been sorted by the
field they are being joined by, the work of the JOIN can be done in
parallel on the shards. The mixer sends SELECT wiki.title FROM
__table3 AS wiki JOIN __table2 AS shakes ON
wiki.title = shakes.word to the shards. The __table2 and
__table3 tables are the outputs of the shuffle phases that have been
patiently waiting to be joined. After the shards compute the JOIN, the
results are returned to the mixer.
6. The mixer doesn't need to do any aggregation in this query, so it just
returns all the results to the caller.
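
The steps above can be sketched as a small in-memory simulation. This is only an illustration of the idea, not BigQuery's implementation. The helper names (partition_of, shuffle, group_by_word, join_partition, run_query), the use of CRC32 as the hash, the sample rows, and the single small partition count shared by both tables are all assumptions made for the sketch.

```python
import zlib
from collections import defaultdict

# Assumption: a small partition count, shared by both tables in this sketch
# so that equal keys always map to the same partition number.
NUM_PARTITIONS = 4


def partition_of(value, num_partitions=NUM_PARTITIONS):
    """Map a key to a partition number; CRC32 is a stand-in hash."""
    return zlib.crc32(value.encode("utf-8")) % num_partitions


def shuffle(rows, field, num_partitions=NUM_PARTITIONS):
    """Redistribute rows so that rows with equal keys share a partition."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[partition_of(row[field], num_partitions)].append(row)
    return partitions


def group_by_word(partition_rows):
    """Per-shard GROUP BY word: the distinct words in one partition."""
    return sorted({row["word"] for row in partition_rows})


def join_partition(wiki_rows, words):
    """Per-shard JOIN: wiki titles that match a word in the same partition."""
    word_set = set(words)
    return [row["title"] for row in wiki_rows if row["title"] in word_set]


def run_query(shakespeare, wikipedia):
    table1 = shuffle(shakespeare, "word")    # plays the role of __table1
    table2 = {p: group_by_word(rows) for p, rows in table1.items()}  # __table2
    table3 = shuffle(wikipedia, "title")     # plays the role of __table3
    results = []
    for p, wiki_rows in table3.items():
        # Each partition is joined independently, as a shard would do.
        results.extend(join_partition(wiki_rows, table2.get(p, [])))
    # The "mixer" only concatenates; sorting just makes the demo output stable.
    return sorted(results)


# Hypothetical sample rows standing in for the two tables.
shakespeare = [{"word": "hamlet"}, {"word": "hamlet"},
               {"word": "king"}, {"word": "thou"}]
wikipedia = [{"title": "hamlet"}, {"title": "king"},
             {"title": "king"}, {"title": "python"}]

print(run_query(shakespeare, wikipedia))
```

Because both inputs are partitioned by the join key, no partition ever needs rows from another partition, which is why the final step needs no aggregation in the mixer.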