If there were no massive distributed filesystem, running a query over a large
dataset would likely be slower than on a traditional relational database
system.
Although there certainly would be ways of partitioning the system so that
it could work with HDFS or other open source tools, this would be a large
undertaking. The Google technology stack, for better or worse, is
specialized, and separating out the technology-independent portions of it
would be a major effort in its own right.
Even if BigQuery (or the Dremel query engine) were open source, you'd
still need a place to run it. Dremel is most useful when run on hundreds
or thousands of machines. A large portion of the value of BigQuery is in
providing a slice of a huge managed compute cluster. While you could
run this on your own hardware or in another vendor's cloud, it would be
expensive and have considerable service-management overhead.
We hope that in the future we will be able to open source portions of the
system as Google's Cloud Platform releases more of the building blocks used
by Google's internal systems to the outside world. In the meantime, users
who want an open source alternative should consider Apache Drill, an open
source project modeled on Dremel, which aims to be compatible with
BigQuery's SQL dialect and API.
BigQuery Technology Stack
Google has an extremely comprehensive and impressive set of internal
infrastructure tools, many of which, such as Spanner, Megastore, and GFS,
have been disclosed in research papers. Some of these tools, such as Bigtable
and GFS, have open source versions. Users often wonder how BigQuery
relates to these technologies: Are BigQuery tables Bigtables, for example? Is
user data stored in GFS?
This section attempts to answer, at a high level, how BigQuery relates to
the Google infrastructure stack. Chapter 9 goes into more detail about the
architecture; if you're interested in how these systems work, you may want
to skip ahead. If Chapter 9 isn't enough detail for you, it provides references
to the research papers that Google has published on the underlying
technologies.