Generating data for performance testing
There are many rules for testing, and I am offering you my set of rules here. My
ideas on this come from the now classic book by Scott Meyers, Effective C++, where
Scott gives you the three rules of performance optimization: a) don't do it,
b) don't do it, and c) do it only after profiling.
The same applies to performance testing on big data. Never take anything for
granted. Dispute every conclusion and require it to be verified by testing. In our
case, the best way to confirm all the performance considerations we just stated
categorically is to generate a sufficient amount of data and perform the actual
tests ourselves.
As we already mentioned in Chapter 3, Using HBase Tables for Single Entities,
our lab code on the Web provides you with sufficient opportunity to do so.
Here are the instructions:
• Clone or download this project from GitHub
(https://github.com/markkerzner/hbase-book). It contains compiled JAR files,
so all that you need to run it is Java.
• In the generators lab, run this generator:
    cd generators
    ./run_generate_users.sh 100 10
• Then, you can load the generated user data.
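To make the idea of generated test data concrete, here is a minimal sketch of a user generator in Java. It is not the generator from the lab project; the class name UserGeneratorSketch, the record layout (user ID, name, e-mail), and the fixed seed are all assumptions chosen for illustration. A fixed seed keeps the data reproducible, so two test runs can be compared against the same input.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical sketch; the lab project's real generator may work differently.
class UserGeneratorSketch {

    // Builds `count` comma-separated user records: userId,name,email.
    // The record layout is an assumption made for this example.
    static List<String> generateUsers(int count, long seed) {
        Random random = new Random(seed); // fixed seed gives reproducible data
        List<String> rows = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            // Zero-padded IDs sort in numeric order when compared as strings.
            String userId = String.format("user%08d", i);
            String name = "name" + random.nextInt(1_000_000);
            rows.add(userId + "," + name + "," + name + "@example.com");
        }
        return rows;
    }

    public static void main(String[] args) {
        // First argument: number of users to generate (defaults to 10).
        int count = args.length > 0 ? Integer.parseInt(args[0]) : 10;
        for (String row : generateUsers(count, 42L)) {
            System.out.println(row);
        }
    }
}
```

Redirecting the output to a file gives you a data set whose size you control, which is the point of the exercise: grow the count until the performance effects you want to verify become measurable.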
Tables for storing videos
You, the reader, need to add an entity table that will store many videos for
each user. Please close the book and design this table.
If you did not store the video itself in this table, you did the right thing. This
shows that you have been reading attentively and absorbing the material so far.
Let's recapitulate why you should not do it:
• The video itself is not a part of the videos table. A row in the table
might be ready when the video file is not.
• If you combine big and small pieces of data in one table, performance tuning
becomes much harder, and your debugging and optimization efforts are undermined.
• Besides, a video is usually stored in a content delivery system.
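One possible shape for such a table can be sketched in plain Java, without the HBase client. The row key format, the column names info:title and info:content_url, and the class name VideoRowSketch are assumptions made for this example, not the book's solution. The point it illustrates is that the row holds only small metadata plus a reference to where the video lives, never the video bytes.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of a videos-table row; all names are illustrative.
class VideoRowSketch {

    // Prefixing the key with the user ID keeps one user's videos adjacent,
    // because HBase stores rows sorted by row key.
    static String rowKey(String userId, String videoId) {
        return userId + ":" + videoId;
    }

    // Returns the cells for one row, keyed by "family:qualifier".
    // Only metadata and a pointer to the content are stored.
    static Map<String, byte[]> metadataColumns(String title, String contentUrl) {
        Map<String, byte[]> columns = new LinkedHashMap<>();
        columns.put("info:title", title.getBytes(StandardCharsets.UTF_8));
        columns.put("info:content_url", contentUrl.getBytes(StandardCharsets.UTF_8));
        return columns;
    }

    public static void main(String[] args) {
        String key = rowKey("user00000042", "video0007");
        Map<String, byte[]> columns =
                metadataColumns("My first video", "https://cdn.example.com/v/0007");
        System.out.println(key + " has " + columns.size() + " columns");
    }
}
```

With this layout, the row can be written as soon as the metadata is known, and the content URL can be filled in or updated later when the video file becomes available.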
 