TableReader. Finally, it waits for all of the threads to complete. Following
is an example of running the indexed table reader to read your favorite table,
publicdata:samples.shakespeare, in three parallel threads:
$ python
>>> import tabledata_index
>>> tabledata_index.parallel_indexed_read(
...     3, 'publicdata', 'samples', 'shakespeare',
...     '/tmp/bigquery')
publicdata:samples.shakespeare last modified at 1335916045099
Reading [0-54885)
Writing results to /tmp/bigquery/shakespeare.0
Reading [54885-109770)
Writing results to /tmp/bigquery/shakespeare.1
Reading [109770-164655)
Writing results to /tmp/bigquery/shakespeare.2
Read 54885 rows at 54885
Read 54885 rows at 109770
Read 54885 rows at 0
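The row ranges in this output are consistent with dividing the table evenly among the three readers. The following is a minimal sketch of that arithmetic, not the actual code in tabledata_index; the function name split_row_ranges is hypothetical, and the total of 164655 rows is inferred from the end of the last range shown above.

```python
def split_row_ranges(total_rows, num_readers):
    """Return half-open [start, end) row ranges, one per reader.

    Hypothetical helper for illustration only.
    """
    # Ceiling division so the ranges cover every row even when
    # total_rows is not an exact multiple of num_readers.
    chunk = -(-total_rows // num_readers)
    ranges = []
    for i in range(num_readers):
        start = min(i * chunk, total_rows)
        end = min(start + chunk, total_rows)
        ranges.append((start, end))
    return ranges


# Assumes 164655 total rows, as implied by the output above.
for start, end in split_row_ranges(164655, 3):
    print('Reading [%d-%d)' % (start, end))
```

Because each range is half-open, the end of one range is the start of the next, so no row is read twice and none is skipped.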
Time Range Decorators
Another way to split up a table is to use a time range decorator, which allows
you to read only data that was added to a table during a particular time
range, for example:
publicdata:samples.wikipedia@1386465812000-1386465899999
Time range decorators create a view of the table containing only the data
that was added between those two timestamps. Like a snapshot decorator,
the times used in time range decorators must be within the last 7 days.
How is reading only a time slice of a table useful when you are reading out
a table? It means you might not have to read out the whole table.
Maybe you read the table yesterday at time T, so today you need to read only
the data that was added between T and now. If you had to read out the entire
table page by page, it might take a long time, but the data that was added in
the last 24 hours might be much more manageable.
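As a rough illustration, a decorated table name for the last 24 hours could be built as follows. This is a sketch under the assumption that the two numbers in the decorator are milliseconds since the Unix epoch, as the example above suggests; the helper names time_range_decorator and last_day_decorator are hypothetical and are not part of any BigQuery client library.

```python
import time

MILLIS_PER_DAY = 24 * 60 * 60 * 1000


def time_range_decorator(table, start_ms, end_ms):
    """Format table@start-end (hypothetical helper).

    Times are assumed to be milliseconds since the Unix epoch.
    """
    if start_ms >= end_ms:
        raise ValueError('start_ms must be earlier than end_ms')
    return '%s@%d-%d' % (table, start_ms, end_ms)


def last_day_decorator(table, now_ms=None):
    """Decorate a table name with the most recent 24-hour window."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    return time_range_decorator(table, now_ms - MILLIS_PER_DAY, now_ms)


# Reproduces the decorated name used in the example above.
print(time_range_decorator(
    'publicdata:samples.wikipedia', 1386465812000, 1386465899999))
```

In practice you would pass the time of your previous read as start_ms rather than a fixed 24-hour offset, so that successive reads do not overlap or leave gaps.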