Monitoring - Practical Cassandra

Health Checks

Monitoring your system with JConsole is tedious, and it works as a monitoring system only while someone is actively watching the graphs and information. Since that is unrealistic and time-consuming, we recommend using a dedicated system such as Nagios to monitor the general health of your system.

Nagios

Nagios is open-source software dedicated to monitoring computers, networks, hosts, and services, and it can alert you when things go wrong or have been resolved. It is extremely versatile and can monitor many types of services, applications, or parts of an application. Let's start at the bottom of the monitoring chain and work our way up. To avoid a complete lesson on monitoring, we will cover only the basics, along with the most common checks as they relate to Cassandra and its operation.

There are three primary alerts in Nagios: WARNING, CRITICAL, and OK. They mean exactly what they sound like. A WARNING alert is sent if the service in question is starting to show signs of a problem, such as a hard drive nearing capacity. A CRITICAL alert is sent if the service in question is down or in a catastrophic state, such as a hard drive that is completely out of space and preventing the applications using that drive from running. An OK alert is sent when the service has recovered or become available again, such as when the total space used on the hard drive has dropped back below the thresholds set for WARNING and CRITICAL.
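To make the three alert levels concrete, here is a minimal sketch of a disk-space check written in the style of a Nagios plugin. Nagios plugins report their state through the process exit code (0 for OK, 1 for WARNING, 2 for CRITICAL, 3 for UNKNOWN) and a one-line status message. The function names `classify` and `check_disk` and the 80% and 90% thresholds are our own assumptions for illustration, not values taken from Nagios or Cassandra; tune them for your environment.

```python
import shutil
import sys

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def classify(percent_used, warn=80.0, crit=90.0):
    """Map a disk-usage percentage to a Nagios state code.

    The default thresholds are illustrative assumptions only.
    """
    if percent_used >= crit:
        return CRITICAL
    if percent_used >= warn:
        return WARNING
    return OK


def check_disk(path="/", warn=80.0, crit=90.0):
    """Return (exit_code, status_line) for the filesystem holding `path`."""
    try:
        usage = shutil.disk_usage(path)
    except OSError as exc:
        # The check itself failed, so the state cannot be determined.
        return UNKNOWN, f"DISK UNKNOWN - {path}: {exc}"
    if usage.total == 0:
        return UNKNOWN, f"DISK UNKNOWN - {path}: reported size is zero"
    percent = 100.0 * usage.used / usage.total
    code = classify(percent, warn, crit)
    return code, f"DISK {LABELS[code]} - {path} is {percent:.1f}% full"


if __name__ == "__main__":
    # Optional first argument: the path to check (defaults to "/").
    code, line = check_disk(sys.argv[1] if len(sys.argv) > 1 else "/")
    print(line)
    sys.exit(code)
```

Run against a Cassandra data directory, a script like this would exit with 1 once the drive passes the warning threshold, 2 once it passes the critical threshold, and 0 again after space is freed, which is when Nagios would send the OK (recovery) alert.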

OS and Hardware Checks

When monitoring any machine, it's best to start with checks at the OS and hardware layer. Even if you are running Cassandra in a virtualized environment such as Amazon or Rackspace, there are still hardware(ish) checks that should be put in place.
