Concretely, we are going to transform a BigQuery table with latitude and
longitude fields into a new table that has all the original fields, plus an
additional field with the ZIP code that is the best match for the record.
To avoid distractions, we will use a simple source table with the following
schema:
[
{"name": "id", "type": "string"},
{"name": "lat", "type": "float"},
{"name": "lng", "type": "float"}
]
The transformed table will have the same schema with one additional field:
[
{"name": "id", "type": "string"},
{"name": "lat", "type": "float"},
{"name": "lng", "type": "float"},
{"name": "zip", "type": "string"}
]
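The relationship between the two schemas can be sketched in a few lines of Python. This is only an illustration: the helper name with_zip_field is hypothetical and not part of any BigQuery API; the field definitions are copied from the schemas above.

```python
import copy
import json

# Source schema, copied from the text above.
SOURCE_SCHEMA = [
    {"name": "id", "type": "string"},
    {"name": "lat", "type": "float"},
    {"name": "lng", "type": "float"},
]


def with_zip_field(schema):
    """Return a copy of `schema` with a trailing string field named "zip".

    Hypothetical helper for illustration; the input list is left unmodified.
    """
    result = copy.deepcopy(schema)
    result.append({"name": "zip", "type": "string"})
    return result


TARGET_SCHEMA = with_zip_field(SOURCE_SCHEMA)
print(json.dumps(TARGET_SCHEMA, indent=2))
```

Deriving the target schema from the source schema, rather than writing it out twice, keeps the two definitions from drifting apart if the source table gains fields later.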
Now that you have a well-specified problem, you can move on to finding a
solution.
Sequential Solution
The most straightforward way to solve the ZIP-code assignment problem
would be to:
1. Export the data from the table to a file on GCS.
2. Download the data to a local file.
3. Run a custom program that transforms the file.
4. Load the transformed file into the new BigQuery table.
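As a rough idea of what the custom program in step 3 might look like, here is a minimal sketch in Python. It rests on several assumptions that do not come from the text: the exported file is CSV with the columns id, lat, lng and no header row; "best match" is taken to mean the ZIP code whose centroid is nearest by great-circle distance; and ZIP_CENTROIDS is a tiny hypothetical lookup table with rough, illustrative coordinates rather than authoritative data.

```python
import csv
import io
import math

# Hypothetical lookup table of (zip, lat, lng) centroids.
# The coordinates are rough and for illustration only.
ZIP_CENTROIDS = [
    ("98101", 47.61, -122.33),
    ("94103", 37.77, -122.41),
    ("10001", 40.75, -73.99),
]


def haversine_km(lat1, lng1, lat2, lng2):
    """Great-circle distance in kilometers between two points in degrees."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lng2 - lng1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def nearest_zip(lat, lng, centroids=ZIP_CENTROIDS):
    """Return the ZIP code whose centroid is closest to (lat, lng)."""
    best = min(centroids, key=lambda c: haversine_km(lat, lng, c[1], c[2]))
    return best[0]


def transform(src, dst):
    """Copy CSV rows from src to dst, appending a zip column to each row."""
    writer = csv.writer(dst)
    for row_id, lat, lng in csv.reader(src):
        writer.writerow([row_id, lat, lng, nearest_zip(float(lat), float(lng))])


# Usage with in-memory files standing in for the downloaded export.
src = io.StringIO("a,47.60,-122.30\nb,40.70,-74.00\n")
dst = io.StringIO()
transform(src, dst)
print(dst.getvalue())
```

The linear scan over every centroid is acceptable for a small table; a real ZIP code data set would more likely call for a spatial index.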
Steps 1, 2, and 4 are covered in detail in the first section and in
Chapter 6. Step 3 is not specific to BigQuery, but the way the transformation
program is constructed matters for how it can later be run in the App Engine
MapReduce framework. Listing 12.7
shows how to solve the problem if you deal with data that is small enough