example, clusters running in Amazon's EC2 have two snitches: Ec2Snitch
and Ec2MultiRegionSnitch. The former is designed to operate well in
a situation where the cluster is spread across multiple availability zones in
a single region (the most common case), whereas the latter is designed to
optimize across regions.
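The choice between the two snitches can be expressed as a small rule. The following is only a sketch of that rule: the function name suggest_ec2_snitch and its input format are hypothetical, and the returned strings use the spelling Cassandra's configuration expects for these classes (Ec2Snitch and Ec2MultiRegionSnitch).

```python
def suggest_ec2_snitch(node_regions):
    """Suggest a snitch name from the EC2 regions the nodes live in.

    node_regions is a hypothetical input: one region string per node,
    for example ["us-east-1", "us-east-1", "eu-west-1"].
    """
    regions = set(node_regions)
    if not regions:
        raise ValueError("at least one node region is required")
    # One region (possibly several availability zones): Ec2Snitch.
    # More than one region: Ec2MultiRegionSnitch.
    return "Ec2Snitch" if len(regions) == 1 else "Ec2MultiRegionSnitch"


if __name__ == "__main__":
    print(suggest_ec2_snitch(["us-east-1", "us-east-1", "us-east-1"]))
    print(suggest_ec2_snitch(["us-east-1", "eu-west-1"]))
```

The selected name would then go into the cluster configuration on every node, because all nodes in a cluster are expected to use the same snitch.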
Because each node has, through gossip and snitch, a complete
understanding of the cluster topology, any node can act as a query server.
When a client connects to the cluster it can choose any node it likes, which
then becomes the coordinator for that client until the connection is closed.
This would be roughly equivalent to every MongoDB server running a mongos
router alongside its mongod process.
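The coordinator relationship described above can be illustrated with a toy model. This is not the behavior of any particular client driver (real drivers usually apply their own load-balancing policies); the Connection class and the node addresses are hypothetical and exist only to show that the node picked at connect time stays the coordinator until the connection is closed.

```python
import random


class Connection:
    """Toy client connection that keeps one node as its coordinator."""

    def __init__(self, nodes, rng=None):
        if not nodes:
            raise ValueError("cluster has no nodes")
        rng = rng or random.Random()
        # Any node is a valid choice, since every node knows the topology.
        self.coordinator = rng.choice(list(nodes))
        self.closed = False

    def execute(self, query):
        """Return (coordinator, query) to show which node handles the query."""
        if self.closed:
            raise RuntimeError("connection is closed")
        return (self.coordinator, query)

    def close(self):
        self.closed = True


if __name__ == "__main__":
    nodes = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]  # hypothetical addresses
    conn = Connection(nodes)
    print(conn.execute("SELECT 1"))
    print(conn.execute("SELECT 2"))  # same coordinator as the first query
    conn.close()
```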
Setting Up a Cluster
Setting up a Cassandra cluster is relatively easy from a software perspective
because each server is essentially identical. For good performance, the
recommendation is to use multi-core machines with 8GB to 16GB of RAM.
Very large heaps in Cassandra actually reduce performance because garbage
collection pauses grow with heap size. The usual recommendation is
somewhere around an 8GB heap (depending on available RAM). In Amazon
EC2, this corresponds to Large or Extra Large instances.
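To make the heap guidance concrete, the sketch below computes a maximum heap size from the machine's RAM. It assumes the heuristic described in the comments of the cassandra-env.sh script shipped with many Cassandra releases, max(min(1/2 RAM, 1024MB), min(1/4 RAM, 8GB)); the function name default_max_heap_mb is hypothetical, and the script in your own version should be checked before relying on these numbers.

```python
def default_max_heap_mb(system_memory_mb):
    """Return a maximum heap size in MB for a machine with the given RAM.

    Assumed heuristic: max(min(1/2 RAM, 1024 MB), min(1/4 RAM, 8192 MB)).
    """
    if system_memory_mb <= 0:
        raise ValueError("system memory must be positive")
    half = system_memory_mb // 2
    quarter = system_memory_mb // 4
    # Small machines get half of RAM (capped at 1 GB); larger machines
    # get a quarter of RAM, never more than 8 GB.
    return max(min(half, 1024), min(quarter, 8192))


if __name__ == "__main__":
    for ram_gb in (8, 16, 32, 64):
        print(ram_gb, "GB RAM:", default_max_heap_mb(ram_gb * 1024), "MB heap")
```

Under this assumed rule the heap never exceeds 8GB however much RAM is installed, which is consistent with the recommendation above.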
A Cassandra server should have as fast a disk as possible. If available, Solid
State Drives (SSDs) are a good match to Cassandra's access pattern. If SSDs
are not available, a group of disks merged via RAID0 is also a good choice.
Cassandra, like Kafka and HDFS, has the ability to use several disks in a
JBOD (Just a Bunch of Disks) configuration, but RAID0 can achieve higher
performance. Using a RAID0 configuration also eliminates imbalances in
the data distribution between drives.
Network-attached storage (NAS) is not recommended for Cassandra
installations, except as a backup medium. Network storage systems such
as NFS or Elastic Block Store (EBS) contend for network I/O resources,
resulting in degraded performance.
Setting up the server itself is straightforward. Installation instructions for
Debian are available at the Apache Cassandra website
(http://cassandra.apache.org). DataStax, a commercial provider of
Cassandra and an active contributor to its code, also provides distributions for
a number of platforms including Windows and OS X. For users wanting