There are several ways to tell that the read capacity of your system isn't
keeping up. The first is to use nodetool cfstats to see how many SSTables
are in the ColumnFamily. If that number is continually increasing, your cluster's
I/O capacity isn't high enough to keep up with the write load. Because
compactions aren't running quickly enough to group related data together in the
SSTables, the data becomes fragmented across them, and each read has to touch
more SSTables. The fix is to add I/O capacity, either by increasing disk speed
(with something like SSDs) or by increasing the number of nodes in the cluster.
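
As a rough illustration of the check described above, the following Python sketch pulls the SSTable counts out of captured nodetool cfstats output and tests whether a series of samples keeps growing. It assumes the output contains lines of the form "SSTable count: N"; the helper names and the sample text are hypothetical, and the growth test is only one possible heuristic.

```python
import re

def sstable_counts(cfstats_text):
    """Return every 'SSTable count' value found in captured cfstats output.

    Assumes each ColumnFamily section has a line like 'SSTable count: 12'.
    """
    return [int(n) for n in re.findall(r"SSTable count:\s*(\d+)", cfstats_text)]

def is_steadily_growing(samples):
    """True if the samples never decrease and the last exceeds the first."""
    if len(samples) < 2:
        return False
    never_drops = all(later >= earlier
                      for earlier, later in zip(samples, samples[1:]))
    return never_drops and samples[-1] > samples[0]

# Hypothetical cfstats fragment covering two ColumnFamilies.
sample_output = "\t\tSSTable count: 12\n\t\tSpace used (live): 0\n\t\tSSTable count: 3\n"
counts = sstable_counts(sample_output)
```

Collecting one count per ColumnFamily at regular intervals and passing the series to is_steadily_growing gives a simple signal that compaction is falling behind.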
On the other hand, if the SSTable count is low, compare the file cache on each
machine with the read pattern. To estimate the amount of file cache, use the
formula total_system_memory - JVM_heap_size. If the amount of data is greater
than that, and you have a roughly random read pattern, then the share of reads
that must seek to the disk is roughly the share of the data that does not fit in
the cache. You may be able to relieve some of these read issues by enabling key
or row caches (by setting KEYS_ONLY, ROWS_ONLY, or ALL). If you enable row
caching, keep the row cache relatively small (about 20,000 rows); the key cache
can be at 100%.
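
The file cache estimate can be worked through with a small Python sketch. It applies the formula total_system_memory - JVM_heap_size and then, assuming a uniformly random read pattern and that all non-heap memory is available as file cache (an optimistic upper bound), estimates the share of reads that would go to disk. The function names and the sizes used are hypothetical.

```python
GIB = 1024 ** 3

def file_cache_bytes(total_system_memory, jvm_heap_size):
    """Estimate the file cache as total_system_memory - JVM_heap_size."""
    return max(total_system_memory - jvm_heap_size, 0)

def estimated_disk_read_fraction(data_bytes, cache_bytes):
    """Share of reads expected to miss the file cache.

    Assumes uniformly random reads, so the miss rate is simply the
    share of the data that does not fit in the cache.
    """
    if data_bytes <= 0:
        return 0.0
    return max(0.0, 1.0 - cache_bytes / data_bytes)

# Hypothetical node: 32 GiB of RAM, an 8 GiB heap, and 96 GiB of data.
cache = file_cache_bytes(32 * GIB, 8 * GIB)
miss_fraction = estimated_disk_read_fraction(96 * GIB, cache)
```

With these example numbers the cache covers a quarter of the data, so about three quarters of random reads would need a disk seek.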
Freezing Nodes
You may run into a situation where the operating system is still responding
normally, but Cassandra seems to be moving slowly. The first thing to check is
whether garbage collection is running. In your Cassandra system.log, look for
entries that reference GCInspector, indicating that either the ParNew or the
ConcurrentMarkSweep collector is taking a long time to run. You will likely see
entries similar to Listing 10.5. These entries were pulled from a machine that
is having GC issues. Notice that the total time spent in GC is high (ranging
from a few seconds up to a few minutes).
Listing 10.5 Example Log Entries for Long-Running GCs
INFO [ScheduledTasks:1] 2013-02-20 15:40:57,096 GCInspector.java (line 122) GC for ParNew: 17305 ms for 1 collections, 2634113808 used; max is 7432306688
INFO [GC inspection] 2013-02-20 15:49:45,973 GCIn-
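
To find entries like these without reading the log by hand, a minimal Python sketch can scan system.log text for GCInspector messages and report the long pauses. The regular expression is modeled only on the message format of the first entry in Listing 10.5; the function name and the 1000 ms default threshold are assumptions, not Cassandra settings.

```python
import re

# Pattern based on the GCInspector message shown in Listing 10.5.
# Whitespace is matched loosely so a wrapped entry still parses.
GC_ENTRY = re.compile(
    r"GC\s+for\s+(?P<collector>\w+):\s+(?P<ms>\d+)\s+ms\s+for\s+"
    r"(?P<count>\d+)\s+collections,\s+(?P<used>\d+)\s+used;\s+"
    r"max\s+is\s+(?P<max>\d+)"
)

def long_gc_pauses(log_text, threshold_ms=1000):
    """Return (collector, pause_ms, heap_used_ratio) for each slow GC entry.

    threshold_ms is an arbitrary cutoff chosen for this sketch.
    """
    pauses = []
    for match in GC_ENTRY.finditer(log_text):
        pause_ms = int(match.group("ms"))
        if pause_ms >= threshold_ms:
            heap_ratio = int(match.group("used")) / int(match.group("max"))
            pauses.append((match.group("collector"), pause_ms, heap_ratio))
    return pauses

# The first entry from Listing 10.5, joined onto one line.
entry = ("INFO [ScheduledTasks:1] 2013-02-20 15:40:57,096 GCInspector.java "
         "(line 122) GC for ParNew: 17305 ms for 1 collections, "
         "2634113808 used; max is 7432306688")
slow = long_gc_pauses(entry)
```

For that entry the sketch reports a ParNew pause of 17305 ms with the heap a little over a third full. Lowering threshold_ms widens the search to shorter pauses.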
 