Figure 9.5 A small B-Tree
B-Trees take advantage of the fact that disk I/O is much slower than
memory access. A single disk read operation can read an entire B-Tree node
into memory, after which the contents of the node can be scanned quickly.
A 3-level B-Tree requires at most three disk reads to find any element. The
processor has to do more work, scanning three entire nodes (which may be
composed of thousands of elements), but the time spent by the CPU is much
smaller than the disk time.
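To make the lookup path concrete, here is a minimal sketch of that search in Python. The Node layout and the read_node_from_disk helper are assumptions made for illustration, and the sketch keeps all values in the leaves, as most database indexes do; a real engine works on raw disk pages rather than Python objects.

```python
from bisect import bisect_right

class Node:
    """One B-tree node, sized so that it fits in a single disk page."""
    def __init__(self, keys, children=None, values=None):
        self.keys = keys          # sorted keys held by this node
        self.children = children  # child page ids (None for a leaf)
        self.values = values      # payloads, parallel to keys (leaves only)

def search(root_page, key, read_node_from_disk):
    """Look up `key`, paying one disk read per level of the tree."""
    node = read_node_from_disk(root_page)        # slow: one disk I/O per level
    while node.children is not None:             # internal node: pick a branch
        i = bisect_right(node.keys, key)         # fast: in-memory scan of the node
        node = read_node_from_disk(node.children[i])
    i = bisect_right(node.keys, key) - 1         # leaf: locate the key itself
    if i >= 0 and node.keys[i] == key:
        return node.values[i]
    return None                                  # key is not in the tree
```

For the three-level tree described above, the loop body runs twice and the leaf is read once, so the whole lookup costs exactly three disk reads no matter how many thousands of keys each node holds.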
Filesystem cache behavior can also help; the root node of the B-Tree will
almost always be cached in memory. The second-level nodes also have a high
cache hit rate because there are only 1,000 of them. This means that most
searches have to do only a single disk seek and read to find the element they
are looking for.
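A quick back-of-the-envelope calculation, assuming the fan-out of roughly 1,000 keys per node used above, shows why the cache works so well here:

```python
FANOUT = 1_000                    # assumed keys/children per node

root_nodes   = 1                  # always hot in the filesystem cache
second_level = FANOUT             # 1,000 nodes: small enough to stay cached
leaf_nodes   = FANOUT ** 2        # 1,000,000 nodes: too many to cache
addressable  = FANOUT ** 3        # ~1 billion keys reachable in 3 levels

print(second_level, leaf_nodes, addressable)   # 1000 1000000 1000000000
```

With the top two levels effectively pinned in memory, the only disk read most lookups actually pay for is the final leaf.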
B-trees are also convenient data structures for inserting and deleting data.
If you add a new node, most of the time all you need to do is add the
relevant pointer to it in the parent node. Only about one insertion in 1,000
forces a rebalance. Of course, rebalancing can be tricky, but decades of
database engineering have gone into perfecting this operation. A delete
operation is even easier: just remove the pointer to the data. Once an
entire node is empty, its storage can be reclaimed.
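The cheap common path and the rare expensive path can be sketched like this, reusing the Node layout from the lookup example; the 1,000-key threshold is again an illustrative assumption:

```python
from bisect import bisect_right

def insert_into_leaf(leaf, key, value, max_keys=1_000):
    """Insert into a leaf in sorted position. Returns True only when the
    leaf overflows and must be split (the rare path that may cascade
    rebalancing up toward the root)."""
    i = bisect_right(leaf.keys, key)
    leaf.keys.insert(i, key)
    leaf.values.insert(i, value)
    return len(leaf.keys) > max_keys      # roughly 1 insert in 1,000

def delete_from_leaf(leaf, key):
    """Remove a key from a leaf. Returns True when the leaf has become
    empty and its page (plus the parent's pointer to it) can be reclaimed."""
    i = leaf.keys.index(key)              # raises ValueError if key is absent
    del leaf.keys[i]
    del leaf.values[i]
    return len(leaf.keys) == 0
```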
When doing analytics queries, you're usually not just looking up a single
value; you're often scanning for data within certain parameters. B-trees
make these types of scans easy: because keys are stored in sorted order, a
range scan only needs to find the first matching entry and then read its
neighbors in order, rather than searching the whole tree again for each
element.
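A range scan can be sketched as a descent that visits only the subtrees overlapping the requested interval, again assuming the Node layout and read_node_from_disk helper from the lookup example:

```python
from bisect import bisect_left, bisect_right

def range_scan(page_id, low, high, read_node_from_disk):
    """Yield (key, value) pairs with low <= key <= high in sorted order,
    reading only the nodes whose key ranges can overlap the interval."""
    node = read_node_from_disk(page_id)
    if node.children is None:                     # leaf: emit matching entries
        lo = bisect_left(node.keys, low)
        hi = bisect_right(node.keys, high)
        yield from zip(node.keys[lo:hi], node.values[lo:hi])
        return
    first = bisect_right(node.keys, low)          # leftmost child that can match
    last = bisect_right(node.keys, high)          # rightmost child that can match
    for child in node.children[first:last + 1]:
        yield from range_scan(child, low, high, read_node_from_disk)
```

Because every key under a child falls between the separator keys around it, whole subtrees outside the interval are skipped without ever being read from disk.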
 