The server receives a log record as a JSON object, which is handled by
RecordHandler.handle(…). The one bit of logic that the code applies
is to direct the record to the daily table corresponding to the timestamp
in the record. It is significant that the code uses data from the record to
deterministically compute the destination table. If it instead used the system
time or some other value that could change independently of the record,
a given record could be inserted into different tables whenever the
client retried the request after failing to receive the acknowledgment
from the server. To protect against similar duplication between the
handler and BigQuery, the code generates a globally unique insertId
by concatenating the device ID and the timestamp, which is unique under
the assumption that each device produces at most one record per second.
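A minimal sketch of this routing and deduplication logic follows. The field names (device_id, timestamp) and the logs_YYYYMMDD table-naming scheme are illustrative assumptions, not the book's actual code:

```python
import datetime

def route_record(record):
    """Compute the destination daily table and a deterministic insertId
    for a log record, using only data carried in the record itself."""
    # Assumed record fields: a Unix 'timestamp' (seconds) and a 'device_id'.
    ts = datetime.datetime.fromtimestamp(
        record["timestamp"], tz=datetime.timezone.utc)
    # One table per UTC day, e.g. logs_20240115 (naming is illustrative).
    table_id = "logs_" + ts.strftime("%Y%m%d")
    # Device ID plus timestamp is globally unique, assuming each device
    # produces at most one record per second.
    insert_id = "%s:%d" % (record["device_id"], record["timestamp"])
    return table_id, insert_id
```

Because both values are derived purely from the record, a retried request yields the same table and the same insertId, so the duplicate is suppressed rather than stored twice.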
Since the client is shipping a JSON object that conforms to the schema
of the BigQuery table, the record can be added directly as the json field of
the insert row. The server is a trampoline for the mobile logs because it
bounces the record from the client, with no transformations, into a BigQuery
operation. This implementation tends to be straightforward but suffers from
the drawback that you will be paying for the App Engine resources required
to process these records. If you have a large volume of data flowing to
BigQuery, the resources required can be substantial. In the next section on
how authentication works on App Engine, you look at a scheme to avoid
passing the data through App Engine.
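A sketch of the tabledata.insertAll request body the trampoline produces; the record is passed through unmodified as the json field of the row:

```python
def make_insert_all_body(record, insert_id):
    """Build a tabledata.insertAll request body for a single record.

    The client's JSON object already matches the table schema, so it is
    forwarded untouched; insertId lets BigQuery deduplicate retries.
    """
    return {
        "kind": "bigquery#tableDataInsertAllRequest",
        "rows": [{"insertId": insert_id, "json": record}],
    }
```

The body would then be POSTed to the insertAll endpoint of the daily table computed from the record's timestamp.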
Before wrapping up this section on saving the logs, consider how these daily
tables are created. After all, the streaming insert operation requires that the
tables already exist. The function to create a set of daily tables for upcoming
days is fairly simple and is shown in Listing 8.5. It attempts to create 3
tables, between 2 and 4 days ahead of the current time. The advantage of
creating multiple consecutive tables is that it automatically incorporates
retries on failures: if you run this handler every day, a given table will
be attempted 3 times before it is required. Of course, creating a table for
a given day on that same day is too late, since insertions will already be failing.
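A hedged sketch of what such a creation handler might produce (Listing 8.5 is the book's actual implementation; the logs_YYYYMMDD naming and the 8-day lifetime here are illustrative assumptions):

```python
import datetime

SECONDS_PER_DAY = 24 * 60 * 60

def daily_table_resources(now=None):
    """Build tables.insert resource bodies for daily tables 2 to 4 days
    ahead of the current time."""
    now = now or datetime.datetime.now(datetime.timezone.utc)
    resources = []
    for days_ahead in (2, 3, 4):
        day = (now + datetime.timedelta(days=days_ahead)).date()
        # Midnight UTC at the start of the day the table receives records.
        day_start = datetime.datetime.combine(
            day, datetime.time(0), tzinfo=datetime.timezone.utc)
        resources.append({
            "tableReference": {"tableId": "logs_" + day.strftime("%Y%m%d")},
            # expirationTime is milliseconds since the epoch, computed
            # relative to the UTC day the table covers, not creation time,
            # so the lifetime stays aligned with the day boundary.
            "expirationTime": int(
                (day_start.timestamp() + 8 * SECONDS_PER_DAY) * 1000),
        })
    return resources
```

Running this once per day gives each table multiple creation attempts before it is needed, and a creation call that fails because the table already exists is harmless.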
However, it is convenient for initial setup. Also take a look at how the
expiry time is set on the table. Note that the expiry time is calculated
relative to the time the table will start to receive records rather than its
creation time. Keeping the expiry aligned with the UTC day boundary also
makes the lifetime more predictable: tables vanishing at arbitrary times
within a day would be harder to reason about.
 