If you did buy and build your own compute cluster so that you could get
BigQuery-like performance, you probably wouldn't be able to run it at
capacity; it would probably get much less usage overnight and on
weekends, for example. If you could run it at capacity, you'd likely have
times of day when demand outstripped supply, and then you'd be sacrificing
performance.
Keeping a giant compute cluster around in order to run a few queries once
in a while seems wasteful. One of the key concepts of the BigQuery query
engine is multitenancy. That is, queries from many different users all run
at once. By multiplexing usage across many customers, who are all on
different schedules and have different data usage patterns, BigQuery can
keep its hardware running at a high level of utilization. Also, BigQuery can
easily grow and shrink its capacity by taking
advantage of extra resources within a datacenter (that is, from other Google
services), or by spinning up new clusters in other datacenters.
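
To make the utilization argument concrete, here is a minimal sketch of a toy
model. It is not a description of how BigQuery actually schedules work. The
three tenants, their demand numbers, and the helper shifted_demand are all
made up for illustration. The sketch compares the capacity needed when each
tenant provisions for its own peak against the capacity needed when they
share one cluster.

```python
# Toy model: hourly demand (arbitrary "compute units") for three tenants
# whose busy periods fall at different times of day. All values are invented.

def shifted_demand(peak_hour, peak=100, base=10, width=4):
    """Return 24 hourly demand values with a busy window centred on peak_hour."""
    demand = []
    for hour in range(24):
        # Circular distance, so a busy window can wrap around midnight.
        distance = min((hour - peak_hour) % 24, (peak_hour - hour) % 24)
        demand.append(peak if distance <= width else base)
    return demand

# Three hypothetical tenants with peaks at 02:00, 10:00, and 18:00.
tenants = [shifted_demand(h) for h in (2, 10, 18)]

# Dedicated clusters: every tenant must provision for its own peak.
dedicated_capacity = sum(max(t) for t in tenants)

# Shared cluster: provision once, for the peak of the combined demand.
combined = [sum(hourly) for hourly in zip(*tenants)]
shared_capacity = max(combined)

# Utilization = work actually done / work the hardware could have done.
total_work = sum(combined)
dedicated_utilization = total_work / (dedicated_capacity * 24)
shared_utilization = total_work / (shared_capacity * 24)

print(f"dedicated: capacity={dedicated_capacity}, "
      f"utilization={dedicated_utilization:.1%}")
print(f"shared:    capacity={shared_capacity}, "
      f"utilization={shared_utilization:.1%}")
```

In this toy setup the shared cluster needs less total capacity and runs at a
higher utilization, because the tenants' peaks rarely coincide. The more
tenants there are, and the more varied their schedules, the stronger the
effect tends to be.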
Analytics as a Service (AaaS?)
There are a lot of acronyms in Cloud Computing, from IaaS (Infrastructure
as a Service) to PaaS (Platform as a Service). BigQuery, if you were going
to give it a similar acronym, could be called Analytics as a Service (AaaS).
We're not particularly excited about this moniker catching on, but as a
description, it is quite apt.
BigQuery is a service that you use to perform your analytics tasks. It
operates at a higher level than most other Big Data analytics offerings. For
example, tools such as Impala and Presto require you to manage your own
virtual hardware and your own data. Even Amazon Redshift, although it is
hosted, requires you to manage a database instance.
Global Data Namespace
One advantage to performing your analytics in the cloud is that it becomes
easy to share data without moving it around. All BigQuery tables sit in
the same namespace. This may seem like a minor detail, but it is actually
extremely useful. Every table in BigQuery can be joined against every other
table in BigQuery, as long as the user running the query has access to both
tables. This means that if someone publishes a table with weather data, you
can join that weather table against your sales data to determine how the
weather affects your sales. There are a number of public datasets that have