$ bq --job_id=${JOB_ID} \
    load scratch.table2 temp.csv \
    "f1:integer,f2:float,f3:string"
Waiting on bqjob_r11463cdf65f08230_00000140037462c6_1 ... (36s) Current status: DONE
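The ${JOB_ID} variable must be set before this command runs; its value isn't shown in the listing, but one simple convention (an assumption here, not the only option) is to derive a unique ID from the current time:
$ JOB_ID=job_$(date +%s)   # any ID unique among your jobs will do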
$ bq --format=prettyjson show -j ${JOB_ID} | grep outputBytes
    "outputBytes": "21",
Here you can see that 21 bytes were loaded: 8 bytes for each of the two numeric fields, plus 3 bytes for “foo”, plus 2 bytes for the null-terminator (2 × 8 + 3 + 2 = 21).
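For reference, a temp.csv consistent with that accounting (its exact contents are an assumption; only the byte total appears above) would hold a single row with one integer, one float, and the string “foo”:
$ cat temp.csv
1,2.0,foo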
Processing Costs
BigQuery charges you for the number of bytes scanned by a query. This is
roughly proportional to the amount of work that BigQuery does for each
query because all queries are essentially table scans—that is, they must read
all the rows in the table.
Each column in a table is stored separately, however, so BigQuery needs
to read only the columns that are directly referenced by the query. This
selectivity can make it somewhat difficult to know how much a given query
will cost, especially because the number of bytes per column is not exposed
to users, only the total number of bytes in the table.
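You can, however, check a table's total size with bq show; the numBytes field is part of the standard table metadata:
$ bq --format=prettyjson show publicdata:samples.wikipedia \
    | grep numBytes
Dividing that total by the number of columns gives only a crude per-column estimate, because columns can vary widely in size.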
Luckily, there is a mechanism to determine query cost (in bytes processed) without paying for it: running the query in dry run mode. A dry run reports the resources a job would consume but does not actually run the job, which makes it handy for estimating what a query would cost. The command shown here uses bq in dry run mode to find out how much it would cost (in bytes) to query over the title field of the public Wikipedia table:
$ bq query --dry_run --format=prettyjson \
    "select title from publicdata:samples.wikipedia" \
    | grep totalBytesProcessed
    "totalBytesProcessed": "7294285723"
The lazy (and perhaps spendthrift) way to determine the cost of the query is just to run it. As mentioned in a previous section, the statistics of a completed query job report the same totalBytesProcessed value.