Tall and narrow rows versus wide rows
This section addresses our second problem: records that are too short.
If the record that you write for one time series set is too small, you run into
cache-hit problems and very large bloom filters. This is easy to understand. Hadoop
reads blocks from HDFS, which usually range in size from 64 MB to 256 MB. If your
data reads and writes are much smaller than this, you are bound to see inefficiencies
and hence the cache-hit problems. Bloom filters answer the question: based on
the key, is it possible that your data resides in the given region? The answer "no" is
definite, but the answer "yes" should be understood as "maybe yes", necessitating a
read of the region and a search. If your keys are responsible for very thin rows with
little information, you will have too many bloom filter keys, which will both take up
hard drive space and reduce the efficiency of using bloom filters in the first place.
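The "no is definite, yes is only maybe" behavior can be illustrated with a toy bloom filter. This is a minimal sketch for intuition only, not how HBase implements its filters; the class name, the bit-array size, the number of hash functions, and the sample row keys are all assumptions made for the example.

```python
import hashlib


class BloomFilter:
    """Toy bloom filter: a bit array plus a few salted hash functions."""

    def __init__(self, num_bits=1024, num_hashes=3):
        # Both sizes are arbitrary illustration values, not tuned settings.
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = [False] * num_bits

    def _positions(self, key):
        # Derive one bit position per hash by salting the key with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = True

    def might_contain(self, key):
        # False means "definitely absent"; True means only "maybe present".
        return all(self.bits[pos] for pos in self._positions(key))


bloom = BloomFilter()
for row_key in ("sensor-1", "sensor-2", "sensor-3"):  # hypothetical row keys
    bloom.add(row_key)
```

Every key that was added is always reported as possibly present, so a "no" can be trusted. A key that was never added is usually rejected, but it may occasionally collide with bits set by other keys, which is why a "yes" still requires reading the region. Each extra key sets more bits, so many thin rows mean a larger filter for the same false-positive rate.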
What are the practical limits? Anything starting with the HDFS block size and
ending in tens of thousands of rows. Even though, theoretically, you can have
millions of columns, Patrick McFadin reports that after tens of thousands of
columns, you start eating into the 95th percentile of your read latencies,
mostly because of deserialization costs on the larger indexes.
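The difference between the two layouts can be sketched with plain dictionaries standing in for a table. This is a hypothetical illustration: the "metric:timestamp" key format, the one-hour bucket, and the function names are assumptions chosen for the example, not a schema prescribed by the text.

```python
from collections import defaultdict

BUCKET_SECONDS = 3600  # assumed row width: one hour of data per row


def tall_narrow_rows(metric, points):
    """One row per data point: the row key carries the full timestamp."""
    return {f"{metric}:{ts}": {"v": value} for ts, value in points}


def wide_rows(metric, points, bucket_seconds=BUCKET_SECONDS):
    """One row per time bucket: the column name is the offset in the bucket."""
    rows = defaultdict(dict)
    for ts, value in points:
        base = ts - ts % bucket_seconds  # start of the bucket holding ts
        rows[f"{metric}:{base}"][ts - base] = value
    return dict(rows)


# Two hours of readings, one every 10 seconds, starting on an hour boundary.
start = 388_888 * BUCKET_SECONDS
points = [(start + i * 10, 20.0 + i) for i in range(720)]

narrow = tall_narrow_rows("temp", points)  # 720 rows, 1 column each
wide = wide_rows("temp", points)           # 2 rows, 360 columns each
```

The same 720 values end up behind 720 row keys in the narrow layout but only 2 in the wide one, so the wide layout leaves far fewer keys for the bloom filter to track. Widening the bucket trades fewer keys for more columns per row, which is where the practical column limit above comes into play.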
OpenTSDB principles
OpenTSDB is a set of tools that allow you to store and retrieve time series data.
It uses HBase for data storage and retrieval, but isolates you, the user, from
HBase completely. Thus, you don't have to know or care about HBase (other than
to administer it). To the user, it is a very simple tool: it asks them to send
time series data and then offers all kinds of displays.
We will use OpenTSDB, as promised, for two purposes:
• To teach you the use of the tool for your needs
• To elucidate the design principles so that you can use them in your
own coding, if you cannot use the OpenTSDB tools directly but need similar
functionality in your application
 