Database Reference
In-Depth Information
Intermediate Results
As you have seen, query results are tables, so they can be used as sources
for further queries. This feature can be useful in a variety of application
scenarios. For example, consider the problem of exploring the mobile
records collected according to the hour and state in which they were
collected. Potentially interesting statistics are:
• Counts by hour summed over all states
• Relative distribution by state in a particular time range
• Hourly distribution for a given state
Running any one of these queries on the base data each time a page is
rendered can be expensive. Because each result would require a unique
query, a simple cache would not be effective. But a single intermediate
table can make all these queries have a fixed, lower cost independent of
the underlying table size. You can construct this intermediate table with the
following query:
SELECT
DATE(l.ts) Day,
HOUR(l.ts) Hour,
z.state State,
COUNT(l.ts) Num
FROM [bigquery-e2e:ch11.sample_device_log] l
INNER JOIN [bigquery-e2e:reference.zip_codes] z
ON z.zip=l.location.zip
GROUP BY 1, 2, 3
ORDER BY 1, 2, 3;
The result table is not suitable for direct rendering because it has 7 × 24 ×
50 = 8400 rows but is small enough that the cost of a query on the table is
negligible. A further rollup query, one that performs additional aggregation,
on this result table can be used to generate a summary view. Filter queries
that select a subset of rows corresponding to specific states and time ranges
can be used to explore details. Also note that identical queries referencing
this table will benefit from the auto-caching feature because this result
table is effectively frozen until the query that generated it is run again. By
virtue of this feature, frequently accessed graphs will end up being cached
and returning immediately. It is also worth noting that the size of this
Search WWH ::




Custom Search