Database Reference
innovative technologies include Apache Cassandra and a variety of distributions of Apache
Hadoop. They share the desirable characteristic of being able to scale efficiently and of being
able to use less-structured data than traditional database systems. Time series data could be
stored as flat files, but if you will primarily want to access the data based on a time span,
storing it in a time series database (TSDB) is likely a good choice. A TSDB is optimized for
queries over a range of time. New NoSQL approaches make use of non-relational databases
with considerable advantages in flexibility and performance over traditional relational
databases (RDBMS) for this purpose. See for a general comparison of NoSQL
databases with relational databases.
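To make the idea of a time-range query concrete, here is a minimal sketch in Python. It is not taken from any particular TSDB; the sample data and the function names `query_range` and `scan_range` are hypothetical. It assumes the samples are kept sorted by timestamp, so a time span can be located with two binary searches instead of a scan over every record, as a flat file would require.

```python
from bisect import bisect_left, bisect_right

# Hypothetical samples: (unix_seconds, value), kept sorted by timestamp.
samples = [(1000, 20.1), (1060, 20.4), (1120, 20.9), (1180, 21.3), (1240, 21.0)]
timestamps = [t for t, _ in samples]


def query_range(start, end):
    """Return samples with start <= timestamp <= end via binary search."""
    lo = bisect_left(timestamps, start)   # first index with timestamp >= start
    hi = bisect_right(timestamps, end)    # first index with timestamp > end
    return samples[lo:hi]


def scan_range(start, end):
    """Same result, but examines every record, as a flat-file scan would."""
    return [(t, v) for t, v in samples if start <= t <= end]


print(query_range(1060, 1180))
# [(1060, 20.4), (1120, 20.9), (1180, 21.3)]
```

Both functions return the same rows. The difference is that the sorted layout lets the query jump straight to the requested span.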
For the methods described in this book, we recommend the Hadoop-based databases Apache
HBase or MapR-DB. The latter is a non-relational database integrated directly into the file
system of the MapR distribution derived from Apache Hadoop. We focus on these
Hadoop-based solutions because they not only ingest time series data rapidly but also
support rapid, efficient queries of time series databases. For the rest of this
book, you should assume that whenever we say “time series database” without being more
specific, we are referring to these NoSQL Hadoop-based database solutions augmented with
technologies to make them work well with time series data.
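
One reason a store such as HBase can serve time-range queries efficiently is that it keeps rows sorted by row key, so a key that ends in a timestamp turns a time span into a contiguous key range. The sketch below illustrates that idea in plain Python without calling any HBase API. The key layout (metric name, a zero byte, then an 8-byte big-endian timestamp) and the helpers `row_key`, `scan`, and `timestamp_of` are assumptions made for illustration, not a schema prescribed by the text.

```python
import struct


def row_key(metric: str, ts: int) -> bytes:
    """Hypothetical key: metric name, zero-byte separator, big-endian timestamp."""
    return metric.encode("utf-8") + b"\x00" + struct.pack(">Q", ts)


def timestamp_of(key: bytes) -> int:
    """Recover the timestamp from the last 8 bytes of a key."""
    return struct.unpack(">Q", key[-8:])[0]


# A sorted list stands in for a store that orders rows by key bytes.
keys = sorted(row_key("cpu", t) for t in (1240, 1000, 1180, 1060))


def scan(metric: str, start: int, end: int):
    """Return keys for one metric whose timestamps fall in [start, end]."""
    lo, hi = row_key(metric, start), row_key(metric, end)
    return [k for k in keys if lo <= k <= hi]


print([timestamp_of(k) for k in scan("cpu", 1000, 1180)])
# [1000, 1060, 1180]
```

Big-endian encoding matters here: it makes byte-wise key order agree with numeric time order, which is what lets a range of keys correspond to a range of time.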
 