tract meaning from the raw data—for example, detecting problems such as a server being
down or anomalies such as a metric for one system being significantly dissimilar from all
the others. The alerting and escalation system communicates these conditions to interested
parties. Visualization systems display the data for human interpretation. Each of these
components is told how to do its job via information from the configuration base.
Over the years there have been many monitoring products, both commercial and open
source. It seems like each one has been very good at two or three components, but left
much to be desired with the others. Some systems have done many components well, but
none has done them all well.
Interfaces between the components are starting to become standardized in ways that let
us mix and match components. This enables faster innovation as new systems can be created
without having to reinvent all the parts.
What follows is a deeper discussion of each component, its purpose, and features that
we've found to be the most useful.
17.1 Sensing and Measurement
The sensing and measurement component gathers the measurements. Measurements can be
categorized as blackbox or whitebox, depending on the amount of knowledge of the internals
that is used. Measurements can be direct or synthesized, depending on whether each
item is separately counted, or totals are periodically retrieved and averaged. We can monitor
the rate at which a particular operation is happening, or the capability of the system to
allow that operation to happen. Systems may be instrumented to provide gauges, such as
percentage CPU utilization, or counters, such as the number of times that something has
occurred.
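To make the gauge and counter distinction concrete, here is a minimal sketch in Python. The class names Gauge and Counter and the helper rate_per_second are our own illustrative choices, not the API of any particular monitoring product. The helper shows one plausible reading of a synthesized measurement: a total is read periodically and averaged over the interval between reads.

```python
from dataclasses import dataclass


@dataclass
class Gauge:
    """A point-in-time value that may rise or fall, such as percentage CPU utilization."""
    value: float = 0.0

    def set(self, value: float) -> None:
        self.value = value


@dataclass
class Counter:
    """A running total that only increases, such as the number of requests served."""
    total: int = 0

    def increment(self, amount: int = 1) -> None:
        if amount < 0:
            raise ValueError("a counter can only go up")
        self.total += amount


def rate_per_second(earlier_total: int, later_total: int, seconds_between: float) -> float:
    """Average rate between two periodic reads of a counter (hypothetical helper)."""
    if seconds_between <= 0:
        raise ValueError("the second read must come after the first")
    return (later_total - earlier_total) / seconds_between
```

For example, if a counter read 100 at one poll and 160 at a poll 60 seconds later, rate_per_second(100, 160, 60.0) gives an average of one operation per second over that interval. A gauge needs no such arithmetic; its latest value is the measurement.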
Let's look at each of those in more detail.
17.1.1 Blackbox versus Whitebox Monitoring
Blackbox monitoring means that measurements try to emulate a user. They treat the system
as a blackbox, whose contents are unknown. Users do not know how a system's internals
work and can only examine the external properties of the system. They may guess about
the internals, but they cannot be sure. In other words, these measurements are done at a
high level of abstraction.
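A measurement at this level of abstraction might be sketched as follows, using only Python's standard library. The function fetches a page the way a user's browser would and records only what is visible from outside. The names probe and ProbeResult, the opener parameter (there so the probe can be exercised without a network), and the crude HTML check are assumptions made for illustration.

```python
import time
import urllib.request
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProbeResult:
    up: bool                  # did we get a successful response at all?
    status: Optional[int]     # HTTP status code, or None if the request failed
    seconds: float            # elapsed time as seen by the client
    looks_like_html: bool     # crude check, not a full HTML validation


def probe(url: str, timeout: float = 5.0, opener=urllib.request.urlopen) -> ProbeResult:
    """Fetch url as a user would and report only externally visible facts."""
    start = time.monotonic()
    try:
        with opener(url, timeout=timeout) as response:
            body = response.read()
            status = response.status
    except OSError:
        # URLError, HTTPError, timeouts, and TLS failures (for example an
        # expired certificate) are all subclasses of OSError.
        return ProbeResult(False, None, time.monotonic() - start, False)
    elapsed = time.monotonic() - start
    looks_like_html = b"<html" in body[:1024].lower()
    return ProbeResult(200 <= status < 400, status, elapsed, looks_like_html)
```

The probe knows nothing about load balancers, server layout, or the technologies behind the URL. It can say that something is wrong, but not which internal part is at fault.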
Doing an HTTPS GET of a website's main page is an example of blackbox monitoring.
The measurement is unaware of any load balancing infrastructure, internal server architecture,
or which technologies are in use. Nevertheless, from this one measurement, we can
determine many things: whether the site is up, how fast it is responding, if the SSL certificate
has expired, if the system is producing a valid HTML document, and so on. Blackbox