Database Reference
In-Depth Information
The next line is:
FROM [publicdata:samples.shakespeare]
After the selected field list comes the FROM clause, which instructs the query
engine where to find the data. Fully specified BigQuery table names are
designated by project_id_or_number:dataset_name.table_name .
That said, you often don't need to use the fully specified name. The project
ID defaults to the project that runs the query. If you don't like specifying the
dataset name either and are using the API (as opposed to the query UI), you
can set a default dataset in the job query configuration.
You may have noticed the funny brackets around the table name that aren't
there in standard SQL. These are generally optional, but some table names
require these quote characters to parse correctly. For example, if the table
name was shakespeare-plays rather than shakespeare , the query
parser would have a difficult time telling whether this was a subtraction
operation ( shakespeare minus plays ) or a table name. To prevent
parsing ambiguities, it is usually safest to just include the brackets.
Moving to the next line in the query, you have:
WHERE corpus CONTAINS 'king' AND LENGTH(word) > 5
This line contains the WHERE clause, which enables you to filter which rows
are returned. In this case, the filter returns only rows where the corpus
field contains “king” and the word field is longer than five letters. You can
also call most functions here; the test for words longer than five letters
in the name could have been written as REGEXP_MATCH(word,
'\\w{5}\\w+') .
One of the last pieces of a query is the optional ORDER BY clause, which
enables you to choose the sort order of the query results:
ORDER BY frequency DESC
One trick that can come in handy is to order by the index of the field in the
SELECT clause. In this case, frequency is the second field mentioned, so
this could have been written as ORDER BY 2 DESC .
Search WWH ::




Custom Search