ROADMAP TO KEY IDEAS IN THIS CHAPTER
Although we've already mentioned some central aspects of handling time series data, this
chapter covers the most important ideas behind methods for storing and accessing time series
in more detail than before. Chapter 4 then provides tips on how best to implement
these concepts using existing open source software. There's a lot to absorb in these two
chapters. To help you keep in mind how the key ideas fit together without getting lost in
the details, here's a brief roadmap of this chapter:
▪ Flat files
    Limited utility for time series; data will outgrow them, and access is inefficient
▪ True database: relational (RDBMS)
    Will not scale well; familiar star schema inappropriate
▪ True database: NoSQL non-relational database
    Preferred because it scales well; efficient and rapid queries based on time range
    — Basic design
        ▪ Unique row keys with time series IDs; column is a time offset
        ▪ Stores more than one time series
    — Design choices
        ▪ Wide table stores data point-by-point
        ▪ Hybrid design mixes wide table and blob styles
        ▪ Direct blob insertion from memory cache
Now that we've walked through the main ideas, let's revisit them in some detail to explain
their significance.
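
To make the "basic design" items in the roadmap concrete, here is a minimal in-memory sketch in Python of a wide table whose row key combines a time series ID with the start of a time window, and whose columns are time offsets within that window. It is an illustration under stated assumptions, not the layout of any particular database: the one-hour window, the `series_id:window_start` key format, and the names `WideTable`, `row_key`, and `column_offset` are all hypothetical, and timestamps are assumed to be non-negative integer seconds.

```python
from collections import defaultdict

# Assumption: one row per series per hour. The roadmap does not fix a window size.
WINDOW_SECONDS = 3600


def row_key(series_id: str, timestamp: int) -> str:
    """Build a row key from the series ID and the start of the time window.

    The window start is zero-padded so that keys for one series sort in
    time order when compared as strings.
    """
    window_start = timestamp - (timestamp % WINDOW_SECONDS)
    return f"{series_id}:{window_start:010d}"


def column_offset(timestamp: int) -> int:
    """Column name: seconds elapsed since the start of the row's window."""
    return timestamp % WINDOW_SECONDS


class WideTable:
    """Hypothetical stand-in for a sorted key-value table (not a real client)."""

    def __init__(self):
        # row key -> {time offset -> value}
        self.rows = defaultdict(dict)

    def put(self, series_id: str, timestamp: int, value: float) -> None:
        """Store one data point, point-by-point, in the wide-table style."""
        self.rows[row_key(series_id, timestamp)][column_offset(timestamp)] = value

    def scan(self, series_id: str, start: int, end: int):
        """Return (timestamp, value) pairs with start <= timestamp < end."""
        first = row_key(series_id, start)
        last = row_key(series_id, end - 1)
        result = []
        # A real store would seek directly to `first`; here we filter sorted keys.
        for key in sorted(self.rows):
            if first <= key <= last:
                window_start = int(key.rsplit(":", 1)[1])
                for offset, value in sorted(self.rows[key].items()):
                    ts = window_start + offset
                    if start <= ts < end:
                        result.append((ts, value))
        return result


table = WideTable()
table.put("sensor-a", 3599, 1.0)
table.put("sensor-a", 3600, 2.0)
table.put("sensor-a", 7300, 3.0)
table.put("sensor-b", 3600, 9.0)  # a second series shares the same table
print(table.scan("sensor-a", 3599, 7300))
```

Because the series ID leads the key and the window start follows it, all rows for one series sit next to each other in key order, so a time-range query touches only a contiguous run of rows. That is the property the roadmap refers to as efficient queries based on time range.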
Simplest Data Store: Flat Files
You can extend this very simple design into something slightly more advanced by using a
cleverer file format, such as the columnar file format Parquet, to organize the data. Parquet is