the sample, nothing would change if you simply left out the string_f entry
from the JSON object.
JSON supports just three primitive types besides null: string, Boolean, and number.
Because BigQuery supports more types, a one-to-one mapping between
primitive types does not exist. BigQuery can accept a JSON string value for
any primitive field if it can parse the string value as the type declared in the
schema. For example, on line 4 of the sample, the Boolean and integer
values appear as strings. They are converted because they can be parsed
as the respective types. Note that the float_f field happens to hold an
integer value. This does not affect how it is stored in BigQuery: the
stored value is 1.0. Finally, timestamp fields
are interpreted as they are in the CSV format. You can pass a JSON number
that represents integer or fractional seconds since the UNIX epoch, or a
human-readable string in the format described in the CSV section. Here,
too, seconds since the epoch may be passed as a string containing the
number.
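As a rough illustration of the coercion rules above, the following Python sketch parses one JSON record and converts each value to the type a schema declares. It is not BigQuery's implementation. The coerce and load_row helpers, the schema dictionary, the field names other than string_f and float_f, and the use of ISO-style strings (assumed to be UTC) for human-readable timestamps are all assumptions made for the example.

```python
import json
from datetime import datetime, timezone

# Hypothetical schema: field name -> declared type. Only string_f and
# float_f are named in the text; the other names are assumed here.
SCHEMA = {
    "string_f": "STRING",
    "boolean_f": "BOOLEAN",
    "integer_f": "INTEGER",
    "float_f": "FLOAT",
    "timestamp_f": "TIMESTAMP",
}


def coerce(value, field_type):
    """Convert one decoded JSON value to the declared type."""
    if value is None:
        return None  # a null or omitted field stays null
    if field_type == "STRING":
        return str(value)
    if field_type == "BOOLEAN":
        if isinstance(value, bool):
            return value
        text = str(value).strip().lower()
        if text in ("true", "false"):
            return text == "true"
        raise ValueError("not a boolean: %r" % (value,))
    if field_type == "INTEGER":
        return int(value)  # accepts 42 and "42"
    if field_type == "FLOAT":
        return float(value)  # the integer 1 becomes 1.0
    if field_type == "TIMESTAMP":
        try:
            # A number, or a string containing a number, is read as
            # seconds since the UNIX epoch.
            seconds = float(value)
        except ValueError:
            # Assumption: human-readable strings are ISO-style and UTC.
            parsed = datetime.fromisoformat(value)
            if parsed.tzinfo is None:
                parsed = parsed.replace(tzinfo=timezone.utc)
            return parsed
        return datetime.fromtimestamp(seconds, tz=timezone.utc)
    raise ValueError("unknown field type: %s" % field_type)


def load_row(line, schema):
    """Parse one JSON record and coerce every field named in the schema."""
    record = json.loads(line)
    return {name: coerce(record.get(name), ftype)
            for name, ftype in schema.items()}


line = ('{"string_f": "hello", "boolean_f": "true", "integer_f": "42", '
        '"float_f": 1, "timestamp_f": "1388534400"}')
row = load_row(line, SCHEMA)
print(row)
```

In this sketch a value that cannot be parsed as the declared type raises ValueError, which stands in for a rejected record.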
It is plain to see that JSON is a verbose format because the field names
appear in every record. In web applications it has replaced XML, which
is even more verbose, so it is generally seen as an improvement over that
format. However, for large data transfers the repetition of field names
imposes a significant byte cost. Fortunately, it turns out that GZIP is
effective at bringing this cost down. CSV and JSON representations of the
same data end up being about the same size after GZIP compression.
BigQuery supports GZIP-compressed JSON, so you should definitely consider
compressing the data, but bear in mind the constraint of keeping each
individual compressed file a reasonable size. The same guideline given for
CSV, 10-100 MB after compression, is also reasonable for JSON.
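One way to check the size claim on your own data is to encode the same rows both ways and compress each with GZIP. The Python sketch below does this with synthetic rows. The field names, the row contents, and the encode_json and encode_csv helpers are invented for the example, and the sizes it prints depend entirely on the data used.

```python
import csv
import gzip
import io
import json

# Hypothetical field names and synthetic rows, for illustration only.
FIELDS = ["id", "name", "score"]
rows = [{"id": i, "name": "user%d" % i, "score": i * 0.5}
        for i in range(10000)]


def encode_json(records):
    """Encode records as JSON, one record per line."""
    return "".join(json.dumps(r) + "\n" for r in records).encode("utf-8")


def encode_csv(records, field_names):
    """Encode the same records as CSV rows without a header line."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    for r in records:
        writer.writerow([r[name] for name in field_names])
    return buf.getvalue().encode("utf-8")


raw_json = encode_json(rows)
raw_csv = encode_csv(rows, FIELDS)
gz_json = gzip.compress(raw_json)
gz_csv = gzip.compress(raw_csv)

for label, raw, packed in (("JSON", raw_json, gz_json),
                           ("CSV", raw_csv, gz_csv)):
    print("%-4s raw=%8d bytes  gzip=%8d bytes" % (label, len(raw), len(packed)))
```

For real loads, writing the compressed bytes with gzip.open and splitting the input so that each output file stays within the suggested size range would follow the same pattern.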
AppEngine Datastore Backup
Datastore is the scalable NoSQL store that is part of the Google AppEngine
platform (http://appengine.google.com/). It is available within the
platform and is also accessible through a standalone HTTP API
(https://developers.google.com/datastore/). Datastore
supports efficient single-record updates, lookup, and index-based scans.
However, scanning a large Datastore table is an expensive operation that
is significantly slower than running a query in BigQuery over a similarly
sized table. The two services complement each other in their performance