Advanced Queries - Google BigQuery Analytics

Database Reference

In-Depth Information

SUM(word_count) AS corpus_total

FROM [publicdata:samples.shakespeare]

WHERE LENGTH(word) > 10

GROUP BY word, word_len, corpus)

ORDER BY word_len DESC

LIMIT 100

This query does not quite do what we set out to do because each row

corresponds to a word and corpus pair. We will fix it as we pivot the data.

If you are only interested in a specific set of his works, it would be more

convenient if you could have a column for the count of the word in each

of the works of interest. In addition, you can then ensure that you collect

100 separate words because each row will correspond to a single word. You

can achieve this by adding multiple aggregation columns with conditional

expressions.

SELECT

word,

SUM(IF(corpus = 'kinglear', corpus_total, 0)) AS

kinglear ,

SUM(IF(corpus = 'tempest', corpus_total, 0)) AS

tempest ,

SUM(IF(corpus = 'macbeth', corpus_total, 0)) AS

macbeth ,

SUM(corpus_total) AS [total]

FROM (

SELECT

word,

LENGTH(word) word_len,

corpus,

SUM(word_count) corpus_total

FROM [publicdata:samples.shakespeare]

WHERE LENGTH(word) > 10

GROUP BY word, word_len, corpus)

GROUP BY word, word_len

ORDER BY word_len DESC

LIMIT 100

Search WWH ::

Custom Search

Home