So, with an address such as patrick@datastax.com, you will create this insert:
insert into email_index (domain,username,user_id) values
('datastax.com', 'patrick','pmcfadin');
To find the user ID for that address later, query the same table by domain and username:
select user_id from email_index where domain='datastax.com' and
username='patrick';
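Both statements above need the address split at the @ sign before any values can be bound. A minimal Python sketch of that client-side step follows. The helper name split_email and the lowercasing are assumptions made for illustration, not part of Patrick's model, and the tuples are simply the values you would bind to the two statements.

```python
def split_email(address):
    """Split 'user@domain' into (domain, username) for email_index.

    Hypothetical helper: lowercasing is an assumption so that lookups
    are case-insensitive; drop it if your addresses are case-sensitive.
    """
    username, sep, domain = address.strip().rpartition("@")
    if not sep or not username or not domain:
        raise ValueError(f"not a valid e-mail address: {address!r}")
    return domain.lower(), username.lower()


# Values to bind to the insert: (domain, username, user_id).
domain, username = split_email("patrick@datastax.com")
insert_params = (domain, username, "pmcfadin")

# Values to bind to the select: (domain, username).
select_params = split_email("patrick@datastax.com")
```

Splitting at the last @ keeps the domain intact even when the local part is unusual, and the same function must be used for both writes and reads so that the keys always match.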
On large domains, such as gmail.com, you might want to expand the domain part of
the key to include part of the username. This gives you more manageable partitions.
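One possible way to split a large domain is to add a short username prefix to the partition key. The sketch below only simulates the effect in Python. The function partition_key, the one-character prefix, and the sample usernames are all assumptions chosen for illustration, not a scheme taken from the text.

```python
from collections import Counter


def partition_key(domain, username, prefix_len=1):
    """Return a hypothetical composite partition key (domain, bucket).

    The bucket is the first prefix_len characters of the username, so one
    large domain is spread over many smaller partitions.
    """
    return (domain, username[:prefix_len])


# Made-up usernames, only to show how the rows spread out.
usernames = ["patrick", "paul", "alice", "bob", "bella", "carol"]

# Rows per partition with a domain-only key versus a (domain, bucket) key.
by_domain = Counter(("gmail.com",) for _ in usernames)
by_bucket = Counter(partition_key("gmail.com", u) for u in usernames)

print(max(by_domain.values()))  # size of the largest domain-only partition
print(max(by_bucket.values()))  # size of the largest bucketed partition
```

A lookup can still reach a single partition, because the bucket is computed from the username that the query already knows. First letters are not evenly distributed, so a real design might prefer a longer prefix or a hash of the username.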
We saw what Patrick does. The essential points here are:
• Create another table to answer your queries. NoSQL offers nothing in the
way of efficient secondary indexes, so build your own.
• Denormalize. That is really the key. If you need to, store the same data
two or more times, as long as that gives you efficient queries that return
answers fast. Your code then has to update the same information in every
table that holds it, but the performance benefits outweigh that extra
work.
• Think about efficiency when you are designing.
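The cost of denormalizing is that every write must touch each copy. The following Python sketch models that bookkeeping with two dictionaries standing in for two tables. The users table, its layout, and the function names are assumptions for illustration only.

```python
# Two dicts stand in for two tables; names and layouts are assumptions.
users = {}        # user_id -> {"email": address}
email_index = {}  # (domain, username) -> user_id


def email_key(address):
    """Build the lookup-table key (domain, username) from an address."""
    username, _, domain = address.rpartition("@")
    return (domain, username)


def set_email(user_id, address):
    """Write the address to the main table and keep the lookup table in sync."""
    old = users.get(user_id, {}).get("email")
    if old is not None:
        # Remove the stale index entry so the old address no longer resolves.
        email_index.pop(email_key(old), None)
    users[user_id] = {"email": address}
    email_index[email_key(address)] = user_id


def find_user_id(address):
    """Answer the query from the lookup table alone, without scanning users."""
    return email_index.get(email_key(address))


set_email("pmcfadin", "patrick@datastax.com")
set_email("pmcfadin", "patrick@example.com")  # change of address
```

In Cassandra the two writes would be separate statements against separate tables. Grouping them in a logged batch is one option if they must succeed or fail together.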
Why do we divide the e-mail IDs by domain? The answer is efficiency. We don't want
partitions that are too small or too large; we want them balanced. Is this
for performance reasons? What are the arguments against tall and narrow rows?
Tables like email_index are called index tables or lookup tables. They are a perfectly
valid way of building fast data models and are used heavily in production environments.
The number of cells is a performance consideration. Even though you can theoretically
store 2 billion cells per row, there are trade-offs in the speed at which you can
access them. Patrick found that tens of thousands of cells is an upper limit before you
start eating into the 95th percentile of your read latencies, mostly due to
deserialization costs on the larger indexes.
So, just to make sure, we have a question: why not have absolutely thin and tall
tables with one value per key? Will this be inefficient?
The answer is that it is mostly inefficient. Lots of rows can cause poor cache hit rates
and huge bloom filters.
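To get a feel for why the number of keys matters for bloom filters, the sketch below applies the textbook sizing formula for an optimally tuned filter, m = -n ln(p) / (ln 2)^2 bits for n keys and false-positive chance p. It is an approximation for illustration only: Cassandra's actual sizing may differ, and the 0.01 false-positive chance and the key counts are assumed values.

```python
import math


def bloom_filter_bytes(num_keys, fp_chance):
    """Approximate size in bytes of an optimally tuned Bloom filter.

    Uses the textbook formula m = -n * ln(p) / (ln 2)^2 bits.
    This is not Cassandra's exact sizing, only an estimate.
    """
    bits = -num_keys * math.log(fp_chance) / (math.log(2) ** 2)
    return math.ceil(bits / 8)


# Assumed false-positive chance of 0.01; key counts are illustrative.
for keys in (1_000_000, 1_000_000_000):
    mib = bloom_filter_bytes(keys, 0.01) / 2**20
    print(f"{keys:>13,} keys -> {mib:,.1f} MiB")
```

The size grows linearly with the number of keys, so a table with one value per key pays this memory cost once for every value stored.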
 