Harnessing the Power of Google's Cloud
When you run your queries via BigQuery, you put a giant cluster of machines
to work for you. Although the BigQuery clusters represent only a small
fraction of Google's global fleet, each query cluster is measured in the
thousands of cores. When BigQuery needs to grow, there are plenty of
resources that can be harnessed to meet the demand.
If you wanted to, you could probably figure out the size of one of BigQuery's
compute clusters by carefully controlling the size of data being scanned in
your queries. The number of processor cores involved is in the thousands,
the number of disks in the hundreds of thousands. Most organizations don't
have the budget to build at that kind of scale just to run some queries over
their data.
The benefits of the Google cloud go beyond the amount of hardware that
is used, however. A massive datacenter is useless unless you can keep it
running. If you have a cluster of 100,000 disks, some reasonable number of
those disks is going to fail every day. If you have thousands of servers, some
of the power supplies are going to die every day. Even if you have highly
reliable software running on those servers, some of them are going to crash
every day.
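
To make the "some fail every day" claim concrete, here is a rough back-of-the-envelope sketch. The 2% annual failure rate is purely an illustrative assumption, not a figure from Google or from this text, and the function names are hypothetical. The model also assumes failures are independent and spread evenly across the year, which real hardware does not strictly obey.

```python
def expected_daily_failures(num_units: int, annual_failure_rate: float) -> float:
    """Average number of units expected to fail per day.

    Assumes failures are independent and uniformly spread over a 365-day year.
    """
    return num_units * annual_failure_rate / 365.0


def prob_at_least_one_failure_today(num_units: int, annual_failure_rate: float) -> float:
    """Probability that at least one unit fails on a given day.

    Uses the same simplifying assumptions as expected_daily_failures.
    """
    daily_rate = annual_failure_rate / 365.0
    return 1.0 - (1.0 - daily_rate) ** num_units


if __name__ == "__main__":
    disks = 100_000       # cluster size used in the text
    assumed_afr = 0.02    # ASSUMPTION: 2% annual failure rate, for illustration only

    print(f"Expected disk failures per day: {expected_daily_failures(disks, assumed_afr):.1f}")
    print(f"Chance of at least one failure today: "
          f"{prob_at_least_one_failure_today(disks, assumed_afr):.4f}")
```

Under that assumed rate, a 100,000-disk cluster would lose roughly five or six disks on an average day, and a day with no failures at all would be rare. A different assumed rate changes the numbers but not the conclusion: at this scale, failure is routine rather than exceptional.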
Keeping a datacenter up and running requires a lot of expertise and
know-how. How do you maximize the life of a disk? How do you know
exactly which parts are failing? How do you know which crashes are due to
hardware failures and which to software? Moreover, you need software that
is written to handle failures at any time and in any combination. Running
in Google's cloud means that Google worries about these things so that you
don't have to.
Another key factor in the performance of Google's cloud is one that some
of the early adopters of Google Compute Engine have started to notice:
it has an extremely fast network. Parallel computation requires a lot of
coordination and aggregation, and if you spend all your time moving the
data around, it doesn't matter how fast your algorithms are or how much
hardware you have. The details of how Google achieves these network
speeds are shrouded in secrecy, but the super-fast machine-to-machine
transfer rates are key to making BigQuery fast.