Database Reference
In-Depth Information
SELECT wikipedia.title
FROM [publicdata:samples.wikipedia] wikipedia
CROSS JOIN (
SELECT corpus
FROM [publicdata:samples.shakespeare]
GROUP BY corpus) AS shakespeare
WHERE wikipedia.title CONTAINS shakespeare.corpus
GROUP BY wikipedia.title
IGNORE CASE
The reason this query needed to be a CROSS JOIN was because the effective
JOIN condition was that one field contains another, ignoring case. There
is no ON clause that could specify the containment condition because ON
clauses must specify an exact match. Instead, the CROSS JOIN matched
each Wikipedia title with each Shakespeare corpus and returned only the
ones where the title contained the name of the Shakespeare play. If part of
the condition includes equality between columns, you should use an INNER
JOIN and move that condition to the ON clause since it will be executed more
efficiently. A CROSS JOIN is only required when no such equality condition
exists. In addition, if the JOIN clause appears in a subquery, you should
push filters into the WHERE clause of the subquery whenever feasible.
Semi-JOIN and Anti-JOIN
Semi- and anti- JOIN are fancy words for JOIN operations that you may
not have even recognized are JOIN s. If you want to specify a filter from a
list generated by a query, you can use a subquery with the IN clause. For
example, consider the following query, which returns long words from the
five largest works in the Shakespeare table.
SELECT word
FROM [publicdata:samples.shakespeare]
WHERE corpus IN (
SELECT corpus FROM (
SELECT corpus, SUM(word_count) total
FROM [publicdata:samples.shakespeare]
GROUP BY corpus ORDER BY total DESC LIMIT 5))
AND LENGTH(word) > 14
GROUP BY word
Search WWH ::




Custom Search