Storing and Processing Time Series Data - Time Series Databases

ROADMAP TO KEY IDEAS IN THIS CHAPTER

Although we've already mentioned some central aspects of handling time series data, this chapter
explores the most important ideas underlying methods to store and access time series in more
detail than before. Chapter 4 then provides tips for how best to implement these concepts using
existing open source software. There's a lot to absorb in these two chapters. So that you can
keep in mind how the key ideas fit together without getting lost in the details, here's a brief
roadmap of this chapter:

▪ Flat files
    Limited utility for time series; data will outgrow them, and access is inefficient
▪ True database: relational (RDBMS)
    Will not scale well; familiar star schema inappropriate
▪ True database: NoSQL non-relational database
    Preferred because it scales well; efficient and rapid queries based on time range
    - Basic design
        ▪ Unique row keys with time series IDs; column is a time offset
        ▪ Stores more than one time series
    - Design choices
        ▪ Wide table stores data point-by-point
        ▪ Hybrid design mixes wide table and blob styles
        ▪ Direct blob insertion from memory cache
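
To make the basic NoSQL design in the roadmap more concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the implementation of any particular database: an in-memory dictionary stands in for the NoSQL table, and the names WideTable, row_key, to_blob, and WINDOW_SECONDS (with its one-hour value) are hypothetical. Each row key combines a time series ID with the start of a time window, each column is a time offset within that window, and one table holds more than one series.

```python
import json
from collections import defaultdict

WINDOW_SECONDS = 3600  # assumption: each row covers one hour of data


def row_key(series_id, timestamp):
    """Row key = series ID plus the start of the window containing timestamp."""
    return (series_id, timestamp - timestamp % WINDOW_SECONDS)


class WideTable:
    """In-memory stand-in for a wide NoSQL table (hypothetical, for illustration)."""

    def __init__(self):
        # row key -> {time offset within the window: value}
        self.rows = defaultdict(dict)

    def put(self, series_id, timestamp, value):
        """Store one data point as a single column in the matching row."""
        key = row_key(series_id, timestamp)
        self.rows[key][timestamp - key[1]] = value

    def scan(self, series_id, start, end):
        """Return (timestamp, value) pairs with start <= timestamp < end."""
        result = []
        first_window = start - start % WINDOW_SECONDS
        # Only rows whose window overlaps the requested range are touched.
        for window in range(first_window, end, WINDOW_SECONDS):
            row = self.rows.get((series_id, window), {})
            for offset in sorted(row):
                ts = window + offset
                if start <= ts < end:
                    result.append((ts, row[offset]))
        return result


def to_blob(row):
    """Hybrid style: collapse one row's columns into a single serialized value."""
    return json.dumps(sorted(row.items()))


table = WideTable()
table.put("cpu.load", 3600, 0.5)   # two points land in the same row
table.put("cpu.load", 3660, 0.7)
table.put("mem.free", 3600, 10)    # a second series shares the table
points = table.scan("cpu.load", 3600, 7200)
blob = to_blob(table.rows[("cpu.load", 3600)])
```

Because the row key leads with the series ID and window start, a query over a time range reads only the rows for the windows it overlaps. The to_blob helper hints at the hybrid design choice: a completed row can be rewritten as one compact value instead of many point-by-point columns.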

Now that we've walked through the main ideas, let's revisit them in some detail to explain their
significance.

Simplest Data Store: Flat Files

You can extend this very simple design to something slightly more advanced by using a more
sophisticated file format, such as the columnar file format Parquet, for organization. Parquet is
