These are metadata files describing the contents of the backup. The first file
stores information specific to the backup of a single kind; the second stores
information about the backup that is common across all kinds.
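If you want to see these files for yourself, you can list the backup folder in
Cloud Storage. The following is a minimal sketch using the google-cloud-storage
Python client; the bucket and prefix are assumptions that mirror the backup
path used in the load example later in this section.
from google.cloud import storage

client = storage.Client()
# List the metadata files in the backup folder: one
# <handle>.<Kind>.backup_info file per kind (for example,
# <handle>.Device.backup_info), plus the metadata file for the
# overall backup.
blobs = client.list_blobs("bigquery-e2e",
                          prefix="data/backup/datastore/001/")
for blob in blobs:
    if blob.name.endswith(".backup_info"):
        print(blob.name)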
With the backup completed, you are now ready to load this data into
BigQuery.
Loading a Backup
Chapter 6 mentioned the capability to load Datastore data in the context of
explaining the sourceFormat option in the load job configuration. Now
we are in a position to explain the full details. A BigQuery load job with the
sourceFormat set to DATASTORE_BACKUP expects a single source URI:
the backup metadata file for the kind that needs to be imported. Note that
the metadata file for the overall backup is not relevant here. In addition, you
need to specify the destination table that will receive the contents of the file.
You can perform the operation using the command-line client.
$ bq mk ch11
$ BACKUP_PATH='gs://bigquery-e2e/data/backup/datastore/001'
$ BACKUP_HANDLE='...'
$ bq load --source_format=DATASTORE_BACKUP \
    ch11.devices \
    ${BACKUP_PATH}/${BACKUP_HANDLE}.Device.backup_info
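The same load can also be run programmatically. Here is a minimal sketch
using the google-cloud-bigquery Python client; the bucket path, the
<backup_handle> placeholder, and the table name all mirror the command-line
example above, so substitute your own values.
from google.cloud import bigquery

client = bigquery.Client()
# The single source URI is the per-kind metadata file; replace the
# placeholder with your actual backup handle.
backup_info = ("gs://bigquery-e2e/data/backup/datastore/001/"
               "<backup_handle>.Device.backup_info")
# No schema is supplied: BigQuery derives it from the backup metadata.
job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.DATASTORE_BACKUP)
job = client.load_table_from_uri(backup_info, "ch11.devices",
                                 job_config=job_config)
job.result()  # Block until the load completes.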
For the most part this looks like a regular load job, but there are a couple of
caveats:
• The WRITE_APPEND write disposition is not supported.
• The schema is not specified in the job.
The write disposition restriction is primarily to avoid operator error because
it is usually not sensible to load multiple copies of a Datastore backup
into the same table. The second restriction is more interesting. Because
Datastore is a NoSQL store, the schema for the destination table has to
be derived. The backup metadata contains the full set of fields (and their
types) encountered while generating the backup. BigQuery uses this data to
generate a schema that can hold all the entities in the backup. Hence the
user is not permitted to specify a schema in the job configuration.
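Because the schema is generated for you, it is often useful to inspect what
BigQuery derived once the load finishes. A short sketch, again assuming the
google-cloud-bigquery Python client and the table from the example above:
from google.cloud import bigquery

client = bigquery.Client()
# Print the schema that BigQuery derived from the backup metadata.
table = client.get_table("ch11.devices")
for field in table.schema:
    print(field.name, field.field_type, field.mode)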