create files in your output GCS bucket. To grant access, substitute your own
AppEngine ID for APP_ID and your own GCS bucket for GCS_BUCKET and
issue the following commands:
$ APP_ID=bigquery-mr-sample
$ GCS_BUCKET=bigquery-e2e
$ gsutil acl ch \
    -u ${APP_ID}@appspot.gserviceaccount.com:W \
    gs://${GCS_BUCKET}
Updated ACL on gs://bigquery-e2e/
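
If the grant needs to be scripted rather than typed by hand, the same command can be assembled in Python. This is only a sketch: the helper names grant_write_command and grant_write are hypothetical, and it assumes gsutil is installed and on the PATH. It issues the same gsutil acl ch call shown above.

```python
import subprocess


def grant_write_command(app_id, gcs_bucket):
    """Build the gsutil argument list that grants the app's service account
    write (W) access to the bucket. Hypothetical helper for illustration."""
    account = "%s@appspot.gserviceaccount.com" % app_id
    return ["gsutil", "acl", "ch", "-u", account + ":W", "gs://" + gcs_bucket]


def grant_write(app_id, gcs_bucket):
    """Run the command; raises CalledProcessError if gsutil exits non-zero."""
    subprocess.check_call(grant_write_command(app_id, gcs_bucket))
```

Calling grant_write("bigquery-mr-sample", "bigquery-e2e") would run the command from the transcript above; grant_write_command alone can be used to inspect the arguments without touching the bucket.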
You can run a simple MapReduce with almost no additional code beyond
the previous script. You just need to set some configuration parameters
and save them in a file called mapreduce.yaml at the top level of the
AppEngine project directory. Here is an example mapreduce.yaml file:
mapreduce:
- name: Add Zip Codes
  mapper:
    handler: add_zip.apply
    input_reader: mapreduce.input_readers.FileInputReader
    output_writer: mapreduce.output_writers._GoogleCloudStorageOutputWriter
    params_validator: validator.adjust_spec
    params:
    - name: files
      value: /gs/bigquery-e2e/chapters/12/add_zip_input.json
    - name: shards
      default: 1
    - name: format
      default: lines
    - name: output_bucket
      default: bigquery-e2e
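
To make the handler entry concrete, here is a rough sketch of the shape a mapper function such as add_zip.apply might take. It is not the script referred to above, which is not reproduced here. The sketch assumes the handler is called once per input line with that line as a string, that each line is a JSON object, and that each string it yields is written by the output writer. The lookup table and the "city" and "zip" field names are hypothetical.

```python
import json

# Hypothetical lookup table; the real script's data source is not shown here.
ZIP_BY_CITY = {"Seattle": "98101", "Kirkland": "98033"}


def apply(line, zip_by_city=ZIP_BY_CITY):
    """Mapper sketch: read one JSON line, add a 'zip' field, and yield one
    newline-terminated JSON line for the output writer."""
    line = line.strip()
    if not line:
        return  # skip blank lines; nothing is yielded
    record = json.loads(line)
    # Assumed field names; unknown cities get a null zip.
    record["zip"] = zip_by_city.get(record.get("city"))
    yield json.dumps(record, sort_keys=True) + "\n"
```

Because the function is a plain generator, it can be exercised locally on a few sample lines before being deployed with the configuration above.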