Database Reference
In-Depth Information
fast, slow, or impossible; understanding the difference in the architectures
can help you generate an intuition about when to use one tool versus
another.
Relational Databases
If you're doing any kind of analytics, you almost certainly have experience
using a relational database. Relational databases exist in many forms—from
Open Source examples such as MySQL and Postgres to commercial
juggernauts such as Microsoft SQL Server and Oracle.
In addition, if you've used a relational database, you've probably even
thought about optimizing one—adding indexes, normalizing data, adding
additional disk spindles, and so on. Relational database architecture tends
to be almost hard-wired into how most people think about querying their
data, whether they realize it or not.
This section outlines the architecture of a typical relational database,
focusing on the parts that contribute most to its performance. You also see
when to use it and how it compares to BigQuery.
Relational Database Design
Relational databases have been around since the early 1970s—they have had
a lot of time to be refined and improved upon. That said, the fundamentals
haven't changed in that time: They still rely on the same underlying data
structures and apply the same types of optimization tricks as they have for
the past several decades.
Data Storage Architecture
The fundamental data structure in a relational database is the B-Tree. A
B-Tree is a specialized tree, usually stored on disk, with a relatively high
branching factor (unlike a binary tree, which has a branching factor of two).
Each node can point to a number of children, up to the branching factor,
which is often on the order of 1000 elements. This means that a tree of
depth 3 could contain up to a billion nodes (1000 3 ). Figure 9.5 shows a
3-level B-Tree. The diagram may look similar to a Dremel serving tree; the
principal difference is that the boxes here represent metadata on disk, not
independent services.
 
Search WWH ::




Custom Search