Database Reference
In-Depth Information
Parallel Lists
When you design schemas for BigQuery, it is best to organize repeated data
so that fields that repeat together are collected into a RECORD type field
that is repeated; this allows for simpler queries over the data. Unfortunately,
there are times when you have to deal with data that is structured as
independent lists where elements at equal indexes are related to each other.
For example, consider this table representing mappings between numbers
in different representations.
[
{"name": "src_id", "type": "string"},
{"name": "dst_id", "type": "string"},
{"name": "src", "type": "string", "mode":
"repeated"},
{"name": "dst", "type":"string", "mode": "repeated"}
]
Ideally, you would have preferred a schema that paired the source and
destination values into a single record.
[
{"name": "src_id", "type": "string"},
{"name": "dst_id", "type": "string"},
{"name": "map", "type": "record", "mode":
"repeated", "fields": [
{"name": "src", "type": "string"},
{"name": "dst", "type": "string"}
]}
]
Fortunately, if you are stuck with the former schema, you can still arrange
for the pairs to be constructed on-the-fly.
You can use the POSITION function to access the index of each value, but
you need to include a FLATTEN operation because the fields accessed are
independently repeating.
SELECT src_id, dst_id, src, dst
FROM FLATTEN((
Search WWH ::




Custom Search