Databases Reference
In-Depth Information
or less than this tree node's values? If greater, go to the right, if less, go to the left.” It
continues in this manner until it either finds the value it's looking for or it sees that the
value it's looking for doesn't exist. If it finds the value, it then follows the pointer to
the actual document, loading that document's page into memory and then returning
it ( Figure 3-3 ).
Figure 3-3. An index entry contains the value for that index and a pointer to the document.
So, suppose we do a query that will end up matching a document or two in our col-
lection. If we do not use an index, we must load 64 million pages into memory from disk:
Pages of data: 256GB / (4KB/page) = 64 million pages
Suppose our index is about 80GB in size. Then the index is about 20 million pages in
size:
Number of pages in our index: 80GB / (4KB/page) = 20 million pages
However, the index is ordered, meaning that we don't have to go through every entry:
we only have to load certain nodes. How many?
Number of pages of the index that must be loaded into memory: ln (20,000,000) =
17 pages
From 64,000,000 down to 17!
OK, so it isn't exactly 17: once we've found the result in the index we need to load the
document from memory, so that's another size-of-document pages loaded, plus nodes
in the tree might be more than one page in size. Still, it's a tiny number of pages com-
pared with traversing the entire collection!
Hopefully you can now picture how indexes helps queries go faster.
Tip #23: Don't always use an index
Now that I have you reeling with the usefulness of indexes, let me warn you that they
should not be used for all queries. Suppose that, in the example above, instead of
fetching a few records we were returning about 90% of the document in the collection.
 
Search WWH ::




Custom Search