Understanding the BigQuery Object Model - Google BigQuery Analytics

Job Configuration

The job configuration section specifies what should get run. BigQuery may

tweak the configuration—it might canonicalize path names, for

example—but after the job has been created (that is, Jobs.insert() has

returned successfully) the configuration will never be changed.

There are four types of jobs: Query, Load, Copy, and Extract. Every query

that you run is a Query job. Load jobs import data from outside of BigQuery.

Copy jobs make fast copies of tables. Extract jobs can be used to make entire

tables available outside of BigQuery.

The job configuration has a subsection for each type of job that can be

run. The presence of the particular subsection is the signal for BigQuery to

run that particular type of job. Only one per-job-type configuration section

should be present at a time. For example, a Query job configuration may

look like this:

{"query": {"query": "SELECT 17"}}

Alternatively, a Load job configuration may look like this:

{"load": {
    "sourceUris": ["gs://foo/bar.csv"],
    "destinationTable": {
        "projectId": "bigquery-e2e",
        "datasetId": "logs",
        "tableId": "latest"}}}
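
The "exactly one job-type subsection" rule can be illustrated with a minimal Python sketch. The helper name job_type and the JOB_TYPES tuple are hypothetical, introduced only for this illustration; the code mimics the rule described above on plain dictionaries and is not how BigQuery itself performs the check.

```python
# Job-type subsection names, following the four job types listed in the text.
JOB_TYPES = ("query", "load", "copy", "extract")


def job_type(configuration):
    """Return the job type implied by a job configuration dict.

    Hypothetical helper: it only checks which subsection keys are present.
    """
    present = [name for name in JOB_TYPES if name in configuration]
    if len(present) != 1:
        raise ValueError(
            "expected exactly one job-type section, found: %r" % (present,))
    return present[0]


query_config = {"query": {"query": "SELECT 17"}}

# Source fields are omitted here; only the destination table is shown.
load_config = {"load": {"destinationTable": {
    "projectId": "bigquery-e2e",
    "datasetId": "logs",
    "tableId": "latest"}}}

print(job_type(query_config))
print(job_type(load_config))
```

Passing a configuration that contains both a "query" and a "load" section, or neither, raises ValueError in this sketch, which mirrors the advice that only one per-job-type section should be present at a time.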

Don't worry about the exact fields that are present here. Individual job types

are discussed in more detail in subsequent chapters: Query jobs in Chapter

7 (“Running Queries”), Load jobs in Chapter 6, Copy jobs in Chapter 10

(“Advanced Queries”), and Extract jobs in Chapter 11 (“Managing Data

Stored in BigQuery”). For now, we will limit discussion to a few settings

that are common across job types: dryRun, createDisposition, and

writeDisposition.
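
As a rough sketch of where these three settings live, the Python snippet below builds a Query job configuration as a plain dictionary. The helper make_query_config is hypothetical. The placement shown (dryRun at the top level of the configuration, the two dispositions inside the job-type section) and the values CREATE_IF_NEEDED and WRITE_TRUNCATE reflect the public REST API as an assumption of this example; check them against the current API reference before relying on them.

```python
import json


def make_query_config(sql, destination_table=None, dry_run=False,
                      create_disposition=None, write_disposition=None):
    """Build a Query job configuration as a plain dict (hypothetical helper)."""
    query_section = {"query": sql}
    if destination_table is not None:
        query_section["destinationTable"] = destination_table
    if create_disposition is not None:
        query_section["createDisposition"] = create_disposition
    if write_disposition is not None:
        query_section["writeDisposition"] = write_disposition

    configuration = {"query": query_section}
    if dry_run:
        # Assumed placement: dryRun sits beside the job-type section,
        # not inside it.
        configuration["dryRun"] = True
    return configuration


config = make_query_config(
    "SELECT 17",
    destination_table={
        "projectId": "bigquery-e2e",
        "datasetId": "logs",
        "tableId": "latest",
    },
    dry_run=True,
    create_disposition="CREATE_IF_NEEDED",
    write_disposition="WRITE_TRUNCATE",
)
print(json.dumps(config, indent=2, sort_keys=True))
```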

Dry Run

The dryRun flag is one of the few settings present on the top level of the

job configuration, and its purpose is to instruct BigQuery to not actually run
