Database Reference
In-Depth Information
PhoneLog(record_id)
);
You can see that the second table declaration looks a lot like the nested
record schema definition, but instead of placing it within an outer schema,
a foreign key relationship is declared. Now look at a simple query over these
two tables that fetches the importance level of a given application within a
particular ZIP code.
SELECT MAX(running.importance_level)
FROM PhoneLog
NATURAL JOIN Application running
WHERE location_zip = '98107'
AND running.name =
'com.googlecode.bigquery_e2e.sensors.client'
The
NATURAL JOIN
operator combines the two tables to generate one
larger table that has the columns from both tables and the same number of
rows as the
Application
table. Because one record from the
PhoneLog
matches zero or more rows in the
Application
table, its values are
repeated for each matching row. Now perform the same operation with the
nested schema.
SELECT MAX(running.importance.level)
FROM [bigquery-e2e:ch10.sample_data]
WHERE location.zip = '98107'
AND running.name =
'com.googlecode.bigquery_e2e.sensors.client'
Because we were careful in choosing the aliases, the query looks almost
identical to the query over the separate tables, but now the
NATURAL JOIN
is implicit. Just as
NATURAL JOIN
repeated the values for the fields from
the left side (
PhoneLog
) table, the values in the top-level fields are repeated
once for each value in the application ID field.
You have probably realized where this discussion is leading. The
NATURAL
JOIN
operation and repeated fields are nearly equivalent. In fact,
sophisticated database storage engines allow you to configure your tables
so that the records of one table are interleaved between the records of the