Database Reference
In-Depth Information
db.products.find({category_ids: null})
Here, when searching for all products lacking a category, the query optimizer will still
be able to use the index on
category_ids
to locate the corresponding products.
But there are two cases where a dense index is undesirable. The first is when you
want a unique index on a field that doesn't appear in every document in the collection.
For instance, you definitely want a unique index on every product's
sku
field. But sup-
pose that, for some reason, products are entered into the system before a sku is
assigned. If you have a unique index on
sku
and attempt to insert more than one prod-
uct without a sku, then the first insert will succeed, but all subsequent inserts will fail
because there will already be an entry in the index where
sku
is null. This is a case
where a dense index doesn't serve your purpose. What you want instead is a
sparse index
.
In a sparse index, only those documents having some value for the indexed key will
appear. If you want to create a sparse index, all you have to do is specify
{sparse:
true}
. So for example, you can create a unique, sparse index on
sku
like so:
db.products.ensureIndex({sku: 1}, {unique: true, sparse: true})
There's another case where a sparse index is desirable: when a large number of docu-
ments in a collection don't contain the indexed key. For example, suppose you
allowed anonymous reviews on your e-commerce site. In this case, half the reviews
might lack a
user_id
field, and if that field were indexed, then half the entries in that
index would be null. This would be inefficient for two reasons. First, it would increase
the size of the index. Second, it would require updates to the index when adding and
removing documents with null
user_id
fields.
If you rarely (or never) expect queries on anonymous reviews, you might elect to
build a sparse index on
user_id
. Here again, setting the
sparse
option is simple:
db.reviews.ensureIndex({user_id: 1}, {sparse: true})
Now, only those reviews linked to a user via the
user_id
field will be indexed.
M
ULTIKEY
INDEXES
You've seen in earlier chapters several examples of indexing fields whose values are
arrays.
9
This is made possible by what's known as a
multikey index
, which allows multi-
ple entries in the index to reference the same document. This makes sense if we take
a simple example. Suppose you have a product document with a few tags like this:
{ name: "Wheelbarrow",
tags: ["tools", "gardening", "soil"]
}
If you create an index on
tags
, then each value in this document's tags array will
appear in the index. This means that a query on any one of these array values can use
the index to locate the document. This is the idea behind a multikey index: multiple
index entries, or keys, end up referencing the same document.
9
Think of category IDs, for instance.