Data Ingestion
Data used in BigQuery must be loaded into the system before it can be
queried. The load process transforms your data into a format that is
optimized for querying and stores it in locations in physical proximity to the
Dremel compute clusters.
There are three ways to get your data into BigQuery: streaming, direct
upload, and through Google Cloud Storage. The most reliable and
predictable is likely the last of these. If your data is already in Google
Cloud Storage, the load step is merely a transfer between two systems
already within Google's cloud, so ingestion is very fast.
Direct upload can be an easier route if you don't want to go through Google
Cloud Storage, because you can follow a standard resumable-upload HTTP
protocol. Streaming is the easiest method; you can post individual rows,
which will be available for query immediately. That said, for large load
operations, or cases in which you want all your data to be available
atomically, streaming may not be the best mechanism. Chapter 6, “Loading
Data,” describes the various options for getting data into BigQuery in
detail.
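As a rough illustration of the difference between a batch load and streaming, the sketch below prepares the same rows in two ways: as newline-delimited JSON, a file format suited to a load job, and as a request body shaped like the one the streaming insert method (tabledata.insertAll) accepts. Nothing is sent over the network. The function names and sample rows are hypothetical, and using a random UUID as the insertId is an assumption made for the example.

```python
import json
import uuid


def to_newline_delimited_json(rows):
    """Serialize rows as one JSON object per line, for a batch load file."""
    return "\n".join(json.dumps(row, sort_keys=True) for row in rows) + "\n"


def to_streaming_body(rows):
    """Wrap rows in a body shaped like a tabledata.insertAll request.

    Each row carries an insertId (a random UUID here, an assumption)
    that identifies the row if the request has to be retried.
    """
    return {
        "rows": [{"insertId": str(uuid.uuid4()), "json": row} for row in rows]
    }


# Hypothetical sample data, used only for illustration.
sample_rows = [
    {"name": "alice", "score": 91.5},
    {"name": "bob", "score": 78.0},
]

load_file = to_newline_delimited_json(sample_rows)
stream_body = to_streaming_body(sample_rows)
```

The load file is a single unit that is ingested as a whole, which fits the atomic case described above, whereas the streaming body can hold as little as one row.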
Structured Data Storage
BigQuery is a system that stores and operates on structured data; that is,
data that follows a rigid schema. A spreadsheet is an example of structured
data, as is a database table. An HTML document, even though it may have
predictable fields, is unstructured. If your data doesn't have a schema, or
can't be coerced to a schema, there may be other tools that are better suited
to your use case.
BigQuery schemas describe the columns, or fields, of the structured data.
Each field has a name and a data type that indicates the kind of data that
can be stored in the field. Those data types can be either primitive or record
types. Primitive types are basic types that store a single value—a string, a
floating-point number, an integer, or a boolean flag.
A record type, however, is a collection of other fields. For the most part, a
record is just a way of grouping your fields together. For example, if you
store location as latitude and longitude, you could have a location record
with two fields: lat and long. Fields can also be repeated, which means
that they can store more than one value.
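To make these ideas concrete, here is a minimal sketch of a schema with a primitive field, a location record, and a repeated field, written in the JSON-style notation BigQuery uses for schemas (name, type, mode, and nested fields). The small checker that follows is not part of BigQuery; it is a hypothetical helper that only illustrates what it means for a row to follow a schema. The field names name and tags are invented for the example.

```python
# Python types assumed to correspond to the primitive field types.
PRIMITIVES = {
    "STRING": str,
    "FLOAT": float,
    "INTEGER": int,
    "BOOLEAN": bool,
}

# Hypothetical schema: one primitive, one record, one repeated field.
schema = [
    {"name": "name", "type": "STRING", "mode": "REQUIRED"},
    {"name": "location", "type": "RECORD", "mode": "NULLABLE", "fields": [
        {"name": "lat", "type": "FLOAT", "mode": "NULLABLE"},
        {"name": "long", "type": "FLOAT", "mode": "NULLABLE"},
    ]},
    {"name": "tags", "type": "STRING", "mode": "REPEATED"},
]


def value_matches(value, field):
    """Check a single value against a field's type."""
    if field["type"] == "RECORD":
        # A record is a collection of other fields, so recurse.
        return isinstance(value, dict) and row_matches(value, field["fields"])
    expected = PRIMITIVES[field["type"]]
    # bool is a subclass of int in Python, so reject it for INTEGER fields.
    if expected is int and isinstance(value, bool):
        return False
    return isinstance(value, expected)


def row_matches(row, fields):
    """Return True if the row uses only known fields with matching types."""
    known = {f["name"] for f in fields}
    if any(key not in known for key in row):
        return False
    for field in fields:
        value = row.get(field["name"])
        if field["mode"] == "REPEATED":
            # A repeated field holds a list of values; missing means empty.
            values = [] if value is None else value
            if not isinstance(values, list):
                return False
            if not all(value_matches(v, field) for v in values):
                return False
        elif value is None:
            if field["mode"] == "REQUIRED":
                return False
        elif not value_matches(value, field):
            return False
    return True


good_row = {
    "name": "office",
    "location": {"lat": 47.6, "long": -122.3},
    "tags": ["hq", "west"],
}
```

With this sketch, row_matches(good_row, schema) is true, while a row that stores tags as a single string, omits name, or puts text in lat is rejected.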