create files in your output GCS bucket. To grant access, substitute your own AppEngine ID for APP_ID and your own GCS bucket for GCS_BUCKET, and issue the following commands:
    $ APP_ID=bigquery-mr-sample
    $ GCS_BUCKET=bigquery-e2e
    $ gsutil acl ch \
        -u ${APP_ID}@appspot.gserviceaccount.com:W \
        gs://${GCS_BUCKET}
    Updated ACL on gs://bigquery-e2e/
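If you prefer to script the grant rather than type it, the same command can be assembled in Python. The following is only a sketch: the helper names service_account, acl_grant_command, and grant are hypothetical, and it assumes gsutil is installed and on your PATH.

```python
import subprocess


def service_account(app_id):
    """Return the default service account address for an App Engine app ID."""
    return "{}@appspot.gserviceaccount.com".format(app_id)


def acl_grant_command(app_id, bucket, permission="W"):
    """Build the gsutil argument list that grants the app access to the bucket.

    The permission letter "W" (write) matches the command shown in the text.
    """
    grant_spec = "{}:{}".format(service_account(app_id), permission)
    return ["gsutil", "acl", "ch", "-u", grant_spec, "gs://{}".format(bucket)]


def grant(app_id, bucket):
    """Run the command; raises CalledProcessError if gsutil reports failure."""
    subprocess.check_call(acl_grant_command(app_id, bucket))


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch has no side effects.
    print(" ".join(acl_grant_command("bigquery-mr-sample", "bigquery-e2e")))
```

Building the command as a list rather than a single string avoids shell quoting problems if an ID or bucket name ever contains unusual characters.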
You can run a simple MapReduce with almost no additional code beyond the previous script. You just need to set some configuration parameters and save them in a file called mapreduce.yaml at the top level of the AppEngine project directory. Here is an example mapreduce.yaml file:
    mapreduce:
    - name: Add Zip Codes
      mapper:
        handler: add_zip.apply
        input_reader: mapreduce.input_readers.FileInputReader
        output_writer: mapreduce.output_writers._GoogleCloudStorageOutputWriter
        params_validator: validator.adjust_spec
        params:
        - name: files
          value: /gs/bigquery-e2e/chapters/12/add_zip_input.json
        - name: shards
          default: 1
        - name: format
          default: lines
        - name: output_bucket
          default: bigquery-e2e
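To make the role of the handler entry concrete, here is a rough local stand-in for what a line-oriented mapper such as add_zip.apply might do. This is not the code behind the configuration above and it does not use the MapReduce library: the lookup table, the "city" and "zip" field names, and the run_locally helper are all hypothetical, and it assumes the mapper receives one line of JSON text at a time and yields the lines to be written to the output.

```python
import json

# Hypothetical lookup table; the real data source is not shown in the text.
ZIP_BY_CITY = {"Seattle": "98101", "Kirkland": "98033"}


def apply(line):
    """Mapper sketch: assumed to receive one line of JSON text per call."""
    record = json.loads(line)
    # "city" and "zip" are illustrative field names, not taken from the original.
    record["zip"] = ZIP_BY_CITY.get(record.get("city"))
    # Yield one newline-terminated JSON record for the output writer.
    yield json.dumps(record, sort_keys=True) + "\n"


def run_locally(lines):
    """Feed each non-empty line through the mapper, roughly as one shard would."""
    output = []
    for line in lines:
        if line.strip():
            output.extend(apply(line))
    return output


if __name__ == "__main__":
    sample = ['{"city": "Seattle"}\n', '{"city": "Kirkland"}\n']
    for out_line in run_locally(sample):
        print(out_line, end="")
```

Writing the mapper as a generator keeps it free of any I/O of its own, so in this sketch the same function can be exercised locally before it is wired into the configuration.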