that every time this page is generated, a large amount of data must be sorted. By pre-sorting
that data, we can reduce the load time of that page. The other hump might be from a web
page that requires a database lookup that is larger than the cache and therefore performs
badly. We can adjust the cache so as to improve performance of that page.
There are many other ways to visualize data. More sophisticated visualizations can draw
out different trends or consolidate more information into a picture. Color can be used to add
an additional dimension. There are many fine books on this subject. We highly recommend
The Visual Display of Quantitative Information by Tufte (1986).
17.6 Storage
The storage system holds the metrics collected and makes them accessible by the other
modules.
Storage is one of the most architecturally demanding parts of the monitoring system.
New metrics arrive in a constant flood, at high speed. Alerting requires fast, real-time, read
access for recent data. At the same time, other analysis requires iterations over large ranges,
which makes caching difficult. Typical SQL databases are bad at all of these things.
Medium-sized systems often collect 25-200 metrics for each server. There are multiple
servers on each machine. A medium-sized system may need to store 400 new metrics each
second. Larger systems typically store thousands of metrics per second. Globally distrib-
uted systems may store tens or hundreds of thousands of metrics every second.
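
As a rough illustration of where such rates come from, the sketch below multiplies out a hypothetical fleet. The function name, the server count, and the one-minute sampling interval are assumptions chosen only so the arithmetic lands on the 400-per-second figure; they are not taken from the text.

```python
def metrics_per_second(servers: int, metrics_per_server: int, interval_s: float) -> float:
    """Average number of data points arriving per second across the fleet."""
    return servers * metrics_per_server / interval_s


# Hypothetical fleet: 240 server processes, 100 metrics each, sampled once a minute.
rate = metrics_per_second(servers=240, metrics_per_server=100, interval_s=60)
print(rate)  # 400.0
```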
As a result, many storage systems handle either real-time data or long-term storage but
not both, or can't do both at large scale. Recently a number of time-series databases such as
OpenTSDB have sprung up that are specifically designed to be good at both real-time and
long-term storage. They achieve this by keeping recent data in RAM, often up to an hour's
worth, as well as by using a highly tuned storage format.
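
The idea of keeping recent data in RAM can be sketched as a per-metric window that evicts points older than a fixed age. This is a minimal illustration, not how OpenTSDB or any particular database does it; the class name RecentWindow and its methods are hypothetical, and it assumes timestamps arrive in non-decreasing order.

```python
from collections import deque


class RecentWindow:
    """Keeps only the most recent `window_s` seconds of points for one metric."""

    def __init__(self, window_s: float = 3600.0):
        self.window_s = window_s
        self.points = deque()  # (timestamp, value) pairs, oldest first

    def append(self, ts: float, value: float) -> None:
        # Assumption: timestamps arrive in non-decreasing order.
        self.points.append((ts, value))
        cutoff = ts - self.window_s
        while self.points and self.points[0][0] < cutoff:
            # Evicted from RAM only; long-term storage is out of scope here.
            self.points.popleft()

    def since(self, ts: float) -> list:
        """Return the in-memory points with timestamp >= ts."""
        return [(t, v) for (t, v) in self.points if t >= ts]


window = RecentWindow(window_s=3600)
window.append(0, 1.0)
window.append(1800, 2.0)
window.append(4000, 3.0)  # the point at t=0 is now older than one hour and is dropped
```

An alerting module would read from a structure like this, while analysis over large time ranges would go to the on-disk format.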
On-disk storage of time-series data is usually done in one of two ways. One method
achieves fast random access by using fixed-size records. For example, each
metric might be stored in a 20-byte record. The system can efficiently find a metric at a
particular time by using a modified binary search. This approach is most effective when
real-time visualization is required. Another method is to compress the data, taking advant-
age of the fact that deltas can be stored in very few bits. The result is often variable-length
records, which means the time-series data must be read from the start to find the metric at a
particular time. These systems permit much greater storage density. Some systems achieve
a balance by storing fixed-size records on a file system with built-in compression.
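
The two on-disk approaches can be contrasted in a short sketch. The 20-byte layout (8-byte timestamp, 4-byte metric id, 8-byte value) is an assumed layout chosen to match the example size above, and the lookup is a plain binary search rather than the modified one the text mentions. The compressed variant stores timestamp deltas as byte-granular varints, a simplification of the bit-level packing real systems use. All function names are hypothetical.

```python
import struct

# Assumed layout: 8-byte timestamp, 4-byte metric id, 8-byte float value = 20 bytes.
RECORD = struct.Struct("<QId")


def pack_series(points, metric_id: int = 1) -> bytes:
    """Pack (timestamp, value) pairs, sorted by timestamp, into fixed-size records."""
    return b"".join(RECORD.pack(ts, metric_id, v) for ts, v in points)


def find_at(buf: bytes, ts: int):
    """Binary search over fixed-size records; returns the value at ts, or None."""
    count = len(buf) // RECORD.size
    lo, hi = 0, count
    while lo < hi:
        mid = (lo + hi) // 2
        mid_ts, _, _ = RECORD.unpack_from(buf, mid * RECORD.size)
        if mid_ts < ts:
            lo = mid + 1
        else:
            hi = mid
    if lo < count:
        rec_ts, _, value = RECORD.unpack_from(buf, lo * RECORD.size)
        if rec_ts == ts:
            return value
    return None


def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer in 7-bit groups, low group first."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def delta_encode(timestamps) -> bytes:
    """Store each timestamp as the difference from the previous one."""
    out = bytearray()
    prev = 0
    for ts in timestamps:  # assumes non-decreasing timestamps
        out += encode_varint(ts - prev)
        prev = ts
    return bytes(out)


def delta_decode(buf: bytes) -> list:
    """Decode sequentially; record N cannot be reached without reading 0..N-1."""
    result, prev, cur, shift = [], 0, 0, 0
    for byte in buf:
        cur |= (byte & 0x7F) << shift
        if byte & 0x80:
            shift += 7
        else:
            prev += cur
            result.append(prev)
            cur, shift = 0, 0
    return result
```

With timestamps ten seconds apart, every delta after the first fits in a single byte, compared with eight bytes per timestamp in the fixed-size layout. The cost is that delta_decode has to walk the data from the beginning, whereas find_at can jump straight to any record.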