Database Reference
In-Depth Information
This query returns the most commonly used words (lowercased) longer
than five letters, the count, and the corpus in which they appear, in any
of Shakespeare's plays with “king” in the title. It doesn't, however, count
frequency across plays; it just returns the count of the word per-play. Here
are the results:
+------------+-----------+----------------+
| word | frequency | corpus |
+------------+-----------+----------------+
| falstaff | 199 | 2kinghenryiv |
| prince | 192 | 1kinghenryiv |
| richard | 188 | kingrichardiii |
| gloucester | 182 | kingrichardiii |
| edward | 181 | 3kinghenryvi |
| falstaff | 168 | 1kinghenryiv |
| gloucester | 141 | kinglear |
| richard | 134 | kingrichardii |
| warwick | 122 | 3kinghenryvi |
| cardinal | 121 | kinghenryviii |
+------------+-----------+----------------+
Now let's walk through the query, line by line:
SELECT LOWER(word) AS word, word_count AS frequency,
corpus
All SQL queries that return data (and thus all BigQuery queries) start with
the word SELECT , indicating that you're selecting data out of a table. After
SELECT , you have comma-delimited field projections. These can be the
raw fields (like word_count ), computed values (like LOWER(word) , which
transforms word to lowercase), or aggregation functions (see the next
query). You can also decide what you want the name of the field to be
by using the AS keyword. For example, word_count AS frequency
renames word_count to frequency in the output. Fields with computed
values get assigned a unique field name, like f0_ , which you may have seen
in query results earlier in this chapter. In this query when we lowercased
the word field, we also assigned an alias, so that it would get a user-friendly
name, rather than f0_ .
Search WWH ::




Custom Search