Database Reference
In-Depth Information
SELECT src_id, dst_id,
POSITION(src) AS src_index, src,
POSITION(dst) AS dst_index, dst
FROM [bigquery-e2e:ch10.parallel]),
src)
WHERE src_index = dst_index
This works well enough, but keep in mind this is quite a bit more expensive
than the simpler queries possible with the more appropriate schema. The
execution engine has to generate the cross product of the elements in the
list before it can filter it down to matching the entries with the matching
positions. Just to demonstrate that you can actually do something useful
with the result of this query, here is a version that does a little more than
pulling the entries together.
SELECT src,
GROUP_CONCAT(CONCAT(dst_id, CONCAT(":", dst))) AS
dst_list
FROM FLATTEN((
SELECT
src_id, dst_id,
POSITION(src) AS src_index, src,
POSITION(dst) AS dst_index, dst
FROM [bigquery-e2e:ch10.parallel]),
src)
WHERE src_index = dst_index
AND src_id = "decimal"
GROUP BY src
ORDER BY src
This approach generalizes to more than two parallel lists. Each additional
repeated field requires a nested FLATTEN . You should also place the WHERE
clauses, applying the position equality filter in the nested queries, rather
than collecting them all at the top-level to reduce the number of
intermediate rows generated by the cross product. That means one such
condition following each FLATTEN invocation.
Search WWH ::




Custom Search