Google Cloud Storage
A useful way to think about the role of Google Cloud Storage is to compare
it to the file system on your personal machine. Effectively, GCS is the file
system for the Google Cloud Platform. Every component of the platform,
including BigQuery, supports reading files stored in GCS. Unless you use
GCS as part of your application platform, your data will not already be hosted
in the service. However, there are a few compelling reasons to use GCS for
transferring data to BigQuery:
• Robust tools and APIs for uploading data
• Simple BigQuery integration
• Cost-effective data archival and backup solution
The drawback is that you have to pay for storing the data in GCS until you
load it into BigQuery, which can be wasteful if you already store your data in
a different location.
With GCS you have already completed the heavy lifting of moving the bytes
representing your data into the Google Cloud Platform even before initiating
the API call to BigQuery. This is accomplished via the GCS API
(https://developers.google.com/storage/docs/overview) or, more
simply, using one of the client tools (gsutil or the browser application).
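If you prefer to script the upload, the following is a minimal sketch using the
google-cloud-storage Python client library; the library choice and the local
filename are assumptions, since the text itself relies on gsutil and the REST API.
# Minimal sketch: upload a local file to GCS using the
# google-cloud-storage client library (an assumed alternative to
# gsutil; install with: pip install google-cloud-storage).
from google.cloud import storage

client = storage.Client()  # picks up your default project and credentials
bucket = client.bucket('bigquery-e2e')  # bucket name matches the later examples
blob = bucket.blob('chapters/06/sample.csv')
blob.upload_from_filename('sample.csv')  # hypothetical local file path
print('Uploaded to gs://%s/%s' % (bucket.name, blob.name))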
GCS objects are arranged according to a two-level naming scheme: a
top-level bucket name and an object name. Bucket names are globally
unique in the service, and object names are unique within a bucket. When
using the gsutil command-line tool to access a file stored in GCS, you will
use a URI of the form:
gs://<bucket>/<object>
BigQuery expects URIs in the same format when referencing GCS files in a
load job. Here is a snippet that sets the GCS source locations in a load configuration:
loadConfig['sourceUris'] = [
    'gs://bigquery-e2e/chapters/06/sample.csv',
    'gs://bigquery-e2e/chapters/06/sample_*',
]
You can see that a single load job can specify multiple GCS URIs and that
URIs can be wildcards. Following the terminology commonly used in shells,
these wildcard URIs are often referred to as globs.
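For context, here is a hedged sketch of where this configuration lives inside a
complete job resource as passed to the BigQuery REST API's jobs.insert method;
the destination table identifiers are illustrative assumptions.
job = {
    'configuration': {
        'load': {
            # GCS source files; wildcard (glob) URIs are allowed.
            'sourceUris': [
                'gs://bigquery-e2e/chapters/06/sample.csv',
                'gs://bigquery-e2e/chapters/06/sample_*',
            ],
            # Destination table identifiers are illustrative assumptions.
            'destinationTable': {
                'projectId': 'bigquery-e2e',
                'datasetId': 'ch06',
                'tableId': 'sample',
            },
        }
    }
}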