could not be created (in which case the work done to transfer the bytes is
lost); otherwise, it will contain the job created. Notice that in the example
we have base64-encoded the data in the request body and added the
Content-Transfer-Encoding header. This is not strictly necessary, but
it can help mitigate issues with problematic HTTP proxies.
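For concreteness, here is a minimal sketch of such a multipart request built by hand, assuming the standard v2 upload endpoint. The project, dataset, table, file, and token values are placeholders, and any HTTP client would work (requests is used here for brevity):

```python
import base64
import json

import requests  # any HTTP client works; used here for brevity

ACCESS_TOKEN = "placeholder-oauth2-token"  # obtained via your OAuth2 flow
BOUNDARY = "xxUPLOADxx"

# Hypothetical load job configuration; all names are placeholders.
job_config = {
    "configuration": {
        "load": {
            "destinationTable": {
                "projectId": "my-project",
                "datasetId": "my_dataset",
                "tableId": "my_table",
            },
            "sourceFormat": "CSV",
        }
    }
}

with open("data.csv", "rb") as f:
    encoded = base64.b64encode(f.read()).decode("ascii")

# multipart/related body: the first part carries the job configuration,
# the second part carries the base64-encoded data itself.
body = (
    "--{b}\r\n"
    "Content-Type: application/json; charset=UTF-8\r\n\r\n"
    "{config}\r\n"
    "--{b}\r\n"
    "Content-Type: application/octet-stream\r\n"
    "Content-Transfer-Encoding: base64\r\n\r\n"
    "{data}\r\n"
    "--{b}--\r\n"
).format(b=BOUNDARY, config=json.dumps(job_config), data=encoded)

resp = requests.post(
    "https://www.googleapis.com/upload/bigquery/v2/projects/my-project/jobs"
    "?uploadType=multipart",
    headers={
        "Authorization": "Bearer " + ACCESS_TOKEN,
        "Content-Type": 'multipart/related; boundary="{}"'.format(BOUNDARY),
    },
    data=body,
)
print(resp.json())  # the created job on success, or an error response
```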
When using the client library, you select this mode of operation by simply
leaving out resumable=True or setting it explicitly to False. However,
there is no good reason to use this mode when working through the client
libraries. The resumable flag defaults to False to maintain backward
compatibility with earlier versions of the client library and APIs that do
not support the Resumable Upload protocol. If for some reason you cannot
use the client library in your application, it is reasonable to implement
this multipart method rather than implement the full Resumable Upload
protocol, which requires more complicated code. Just be aware that you may
encounter failed HTTP requests when uploading large amounts of data with
this approach. The problem is that the likelihood of a random failure
affecting a request increases with the size of the request, so a very large
request can fail repeatedly due to intermittent network failures.
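Through the Python client library, the same single-request upload looks roughly like the following. MediaFileUpload and jobs().insert() are from google-api-python-client; the project, dataset, and table names are placeholders, and credential setup is omitted:

```python
from googleapiclient.discovery import build
from googleapiclient.http import MediaFileUpload

# Credential setup omitted for brevity; pass credentials= (or http=) to build().
service = build("bigquery", "v2")

# resumable defaults to False, which selects the single-request
# multipart upload described above.
media = MediaFileUpload("data.csv",
                        mimetype="application/octet-stream",
                        resumable=False)

job = service.jobs().insert(
    projectId="my-project",  # placeholder project ID
    body={"configuration": {"load": {
        "destinationTable": {
            "projectId": "my-project",
            "datasetId": "my_dataset",  # placeholder names
            "tableId": "my_table",
        },
        "sourceFormat": "CSV",
    }}},
    media_body=media,
).execute()
print(job["jobReference"])  # reference to the created load job
```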
That covers the different options you have for moving your data into
BigQuery. If you primarily work with installed software, it may seem odd to
move the data rather than move the software. On the other hand, if you are
familiar with cloud-based services, the process of moving your data into the
cloud will feel natural. This section has covered three separate methods for
transferring data. Google Cloud Storage is ideal if you would like to retain a
backup copy of your data outside of BigQuery. Once the data is uploaded to
GCS, importing it into BigQuery is simply a matter of referencing the files.
If you only need the data to be stored in BigQuery, the Resumable Upload
protocol is the best choice because it allows for large amounts of data to
be transferred robustly. Finally, if simplicity or minimizing the number of
HTTP requests is the most important consideration, you can use multipart
requests, but be aware that this method may not scale well to large data
sizes.
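To make the Cloud Storage option concrete, a load job that reads files already sitting in GCS simply lists their URIs in the configuration; no bytes travel with the request itself. A minimal sketch, with placeholder bucket and table names:

```python
# BigQuery reads the files directly from GCS; a single "*" wildcard
# lets one URI match many files.
gcs_load_config = {
    "configuration": {
        "load": {
            "sourceUris": ["gs://my-bucket/exports/data-*.csv"],
            "destinationTable": {
                "projectId": "my-project",
                "datasetId": "my_dataset",
                "tableId": "my_table",
            },
            "sourceFormat": "CSV",
        }
    }
}
# Submitted as an ordinary (non-upload) jobs.insert request:
# service.jobs().insert(projectId="my-project", body=gcs_load_config).execute()
```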
Destination Table
Now you need to control where the data you are loading ends up inside
BigQuery. The load job configuration specifies a single destination table and
optionally includes a schema for the table. The destination table can live in
any dataset that you have permission to write to.
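As a sketch of that part of the configuration (field names follow the v2 jobs API; the table reference and schema shown are placeholders, and the schema is typically needed only when the destination table does not already exist):

```python
# Fragment of a load job configuration; all names are placeholders.
load_config_fragment = {
    "load": {
        "destinationTable": {  # fully qualified table reference
            "projectId": "my-project",
            "datasetId": "my_dataset",
            "tableId": "my_table",
        },
        "schema": {  # optional if the table already exists
            "fields": [
                {"name": "name", "type": "STRING"},
                {"name": "value", "type": "INTEGER"},
            ]
        },
    }
}
```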