SQL commands. Creating and updating tables and adding data to existing tables are
accomplished via API calls rather than SQL commands.
As with a relational database, data in BigQuery is structured into tables. These
tables are organized into groups known as datasets. In turn, each dataset belongs to a
project. Because the service is completely hosted, BigQuery bills usage at the project
level. The BigQuery service also enables project owners to share datasets with users
outside of the project. In this case, users from other BigQuery projects can run queries
on the public datasets that you create; they cover the cost of the queries they run on
your public dataset, and your project is billed only for storage of the data. BigQuery
also supports access control lists, or ACLs. Access to your datasets can be scoped to
whatever granularity is appropriate: private to one person, shared only with the members
of a domain, or a combination of sharing permissions.
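The project, dataset, and table hierarchy is visible in the bracketed table names used in BigQuery queries, such as [publicdata:samples.wikipedia]. The following is a minimal Python sketch that splits such a name into its three levels. The TableRef type and the parse_table_ref helper are hypothetical names introduced for illustration, not part of any BigQuery client library, and the parsing is deliberately simplified.

```python
from typing import NamedTuple


class TableRef(NamedTuple):
    """One table's position in the project > dataset > table hierarchy."""
    project: str
    dataset: str
    table: str


def parse_table_ref(ref: str) -> TableRef:
    """Split a bracketed name of the form [project:dataset.table].

    Simplified sketch: the text before the last colon is treated as the
    project, and the remainder is split at the first dot.
    """
    inner = ref.strip().strip("[]")
    project, colon, rest = inner.rpartition(":")
    dataset, dot, table = rest.partition(".")
    if not (colon and dot and project and dataset and table):
        raise ValueError(f"expected [project:dataset.table], got {ref!r}")
    return TableRef(project, dataset, table)


ref = parse_table_ref("[publicdata:samples.wikipedia]")
print(ref.project, ref.dataset, ref.table)
```

Here the project is publicdata, the dataset is samples, and the table is wikipedia.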
BigQuery's Query Language
Asking questions about the data stored in a relational database almost always means
that you will be writing queries in SQL, using syntax that is similar to an established
standard such as SQL-92. BigQuery uses an SQL-like syntax, but it doesn't support
all, or even most, of the functions available in SQL-92. Some of the differences arise
because BigQuery is a query engine, not a transactional database, so it lacks
the SQL syntax necessary to create or update individual records.
One of BigQuery's primary use cases is to provide fast aggregate query results over
large tables of data. Often, a GROUP BY clause is used together with an ORDER BY
clause and a LIMIT to produce a count of the top results. This type of query is so
common that BigQuery provides a shortcut function called TOP for this purpose. Let's
look at one of BigQuery's current sample tables, called “wikipedia,” which contains
over 300 million rows of Wikipedia revision history information. Listing 6.1 provides
an example of using the TOP function to produce an ordered list of the top five most
revised pages whose titles contain the term “data.”
Listing 6.1 BigQuery's TOP function
/* BigQuery TOP() function: Combines the functionality
   of GROUP BY, LIMIT and ORDER BY
*/
SELECT
  TOP(title, 5),
  COUNT(*)
FROM
  [publicdata:samples.wikipedia]
WHERE
  title CONTAINS "data";
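As a rough illustration of what TOP combines, the sketch below runs the longer GROUP BY, ORDER BY, and LIMIT form against a small in-memory SQLite table using Python's sqlite3 module. The table contents, the revisions alias, and the top_titles helper are invented for illustration. SQLite is only a stand-in for BigQuery here, and instr() substitutes for CONTAINS, which SQLite does not provide.

```python
import sqlite3

# Hypothetical toy rows: one row per page revision, as in the sample table.
REVISIONS = (
    [("Big data",)] * 3
    + [("Metadata",)] * 2
    + [("Open data",)] * 1
    + [("Python",)] * 4
)

# Long-form query that the TOP shortcut is described as replacing.
LONG_FORM_QUERY = """
SELECT title, COUNT(*) AS revisions
FROM wikipedia
WHERE instr(title, 'data') > 0   -- stand-in for: title CONTAINS "data"
GROUP BY title
ORDER BY revisions DESC, title
LIMIT 5
"""


def top_titles(rows):
    """Load rows into an in-memory table and return the top titles by count."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE wikipedia (title TEXT)")
        conn.executemany("INSERT INTO wikipedia (title) VALUES (?)", rows)
        return conn.execute(LONG_FORM_QUERY).fetchall()
    finally:
        conn.close()


for title, revisions in top_titles(REVISIONS):
    print(title, revisions)
```

The "Python" rows are filtered out by the WHERE clause before grouping, so only titles containing "data" are counted and ranked.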
BigQuery also supports data structures that have nested and repeated fields.
This means that a particular field may define a new row containing a set of child
 
 