Jobs.getQueryResults() RPC
There are two limitations to the Jobs.query() API: sometimes queries run longer than the timeout you specify, and sometimes queries return more data than you can read in a single page of results. The Jobs.getQueryResults() API addresses both of these issues by giving you a mechanism to pick up where Jobs.query() left off.
When you run the original Jobs.query(), it returns three important pieces of data: a jobId that can be used to look up information about the query job, a jobComplete flag that tells you whether the query completed within the timeout value, and a pageToken that lets you page through additional results (if there are any).
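As a small illustration of those three fields, the sketch below pulls them out of a response dictionary. It assumes the response has the shape of the BigQuery v2 REST reply, where the job ID is nested under a jobReference object; the helper name summarize_query_response and the sample values are hypothetical and written by hand for this example.

```python
def summarize_query_response(response):
    """Extract jobId, jobComplete, and pageToken from a Jobs.query() reply.

    Assumes the v2 REST shape, where jobId lives under 'jobReference'.
    """
    job_ref = response.get('jobReference', {})
    return {
        'jobId': job_ref.get('jobId'),
        # Treat a missing flag as "not complete" to stay on the safe side.
        'jobComplete': response.get('jobComplete', False),
        # pageToken is absent when there are no further pages.
        'pageToken': response.get('pageToken'),
    }


# Hypothetical, hand-written response used only for illustration.
sample_response = {
    'jobReference': {'projectId': 'my-project', 'jobId': 'job_123'},
    'jobComplete': True,
    'pageToken': 'page-2-token',
    'rows': [],
}

summary = summarize_query_response(sample_response)
```

A missing pageToken comes back as None, which is a convenient signal that there is nothing more to read.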
After you have the jobId from the Jobs.query() result, you can use it to call Jobs.getQueryResults(). The result format of Jobs.getQueryResults() is identical to that of Jobs.query(). If the query still isn't done, the jobComplete flag will still be false. If the query does complete within the timeout, the first page of results is returned, along with a pageToken that lets you read more results.
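The control flow described above (retry while jobComplete is false, then follow pageToken until it runs out) can be sketched as follows. This is not the book's listing: get_results is a hypothetical callable standing in for whatever wrapper you write around Jobs.getQueryResults(), and the fake used here simply replays canned replies so the sketch runs without credentials.

```python
def fetch_all_rows(get_results, job_id, timeout_ms=10000):
    """Collect every row of a query job, one page at a time.

    get_results(job_id, page_token, timeout_ms) is an assumed wrapper
    that returns a dict shaped like a Jobs.getQueryResults() reply.
    """
    rows = []
    page_token = None
    while True:
        reply = get_results(job_id, page_token, timeout_ms)
        if not reply.get('jobComplete', False):
            # The query is still running; ask again with the same token.
            continue
        rows.extend(reply.get('rows', []))
        page_token = reply.get('pageToken')
        if not page_token:
            # No token means this was the last page.
            return rows


def make_fake_get_results():
    """Build a stand-in that replays three canned replies in order."""
    replies = iter([
        {'jobComplete': False},
        {'jobComplete': True, 'rows': [1, 2], 'pageToken': 'p2'},
        {'jobComplete': True, 'rows': [3]},
    ])
    seen_tokens = []

    def fake(job_id, page_token, timeout_ms):
        seen_tokens.append(page_token)
        return next(replies)

    return fake, seen_tokens


fake_get_results, seen_tokens = make_fake_get_results()
all_rows = fetch_all_rows(fake_get_results, 'job_123')
```

The loop has no retry limit; production code would likely cap the number of attempts or the total elapsed time.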
You can call Jobs.getQueryResults() on any query job, not just one that was run via Jobs.query(). This can be useful because the waiting is done on the server side, so you'll get a response as soon as the query has completed. That is, since the Jobs.getQueryResults() API waits for the query to finish (or time out), you don't need to add a sleep operation in your code; all of the waiting occurs during the API call. It also saves an API call, because you don't need a separate request to check that the query has completed before reading the results; the results are returned as soon as they are ready.
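To make the "no sleep needed" point concrete, here is a minimal sketch of a wait loop that relies only on the server-side timeout. It assumes service is a BigQuery v2 service object of the kind built by the Google API Python client, whose jobs().getQueryResults(...).execute() call accepts projectId, jobId, and timeoutMs. The Fake* classes are hypothetical stand-ins so the example can run offline.

```python
def wait_for_query(service, project_id, job_id, timeout_ms=10000):
    """Return the first reply in which the query job is complete.

    There is no time.sleep() here: each call blocks on the server for
    up to timeout_ms, so the loop only spins when that timeout expires.
    """
    while True:
        reply = service.jobs().getQueryResults(
            projectId=project_id,
            jobId=job_id,
            timeoutMs=timeout_ms).execute()
        if reply.get('jobComplete', False):
            return reply


# Hypothetical stand-ins that mimic the client's call chain.
class FakeRequest:
    def __init__(self, reply):
        self._reply = reply

    def execute(self):
        return self._reply


class FakeJobs:
    def __init__(self, replies):
        self._replies = list(replies)
        self.calls = []

    def getQueryResults(self, **kwargs):
        self.calls.append(kwargs)
        return FakeRequest(self._replies.pop(0))


class FakeService:
    def __init__(self, replies):
        self._jobs = FakeJobs(replies)

    def jobs(self):
        return self._jobs


fake_service = FakeService([
    {'jobComplete': False},
    {'jobComplete': True, 'totalRows': '0'},
])
final_reply = wait_for_query(fake_service, 'my-project', 'job_123', timeout_ms=5000)
```

Because the job is identified only by its project and job ID, the same loop works for a query job started some other way than Jobs.query().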
Listing 7.1 demonstrates the use of Jobs.query() and Jobs.getQueryResults() to run a query and fetch all the results.
Listing 7.1: Running a query via Jobs.query() and polling for results with Jobs.getQueryResults() (query.py)
import auth
import pprint
import sys
def print_results(schema, rows):
    ''' Prints query results, given a schema. '''