DynamoDB versus Cassandra
Let's start with the data model. DynamoDB's storage model is very similar to Cassandra's: data is distributed by hashing the row key, and the data inside a row is ordered by a specific column. In DynamoDB, an attribute can be single-valued (scalar) or multivalued (a set). Cassandra has a richer set of column types, such as Integer, BigInteger, ASCII, UTF8, and Double, and it also offers composite and dynamic composite columns. It thus covers the full range of data formats (structured, semi-structured, and unstructured) found in modern applications, whereas DynamoDB has only two attribute types, namely String and Number.
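This typing difference shows up directly in the write payloads. The following sketch (plain Python dictionaries, not a live connection to either store; the record fields are hypothetical) contrasts a Cassandra-style typed row with a DynamoDB low-level item, where every attribute value is tagged as a string (`S`), a number (`N`), or a multivalued set (`SS`/`NS`):

```python
# Sketch: how the same record is typed in each store.
# No client libraries are used; these are illustrative payload shapes.

# Cassandra: the row key is hashed to place the row on the cluster, and
# the columns inside the row carry explicit types (UTF8, Integer, ...).
cassandra_row = {
    "row_key": "user#42",                  # hashed for placement
    "columns": [                           # ordered within the row
        ("email", "UTF8Type", "a@b.com"),
        ("age", "IntegerType", 31),
        ("score", "DoubleType", 4.5),
    ],
}

# DynamoDB low-level item: String (S) and Number (N) scalars, plus
# their multivalued set forms (SS, NS). Numbers travel as strings.
dynamodb_item = {
    "UserId": {"S": "user#42"},
    "Age": {"N": "31"},
    "Tags": {"SS": ["admin", "beta"]},     # multivalued string attribute
}
```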
Cassandra supports multi-datacenter replication across regions, whereas DynamoDB replicates data across multiple availability zones within a single region; cross-region replication is not supported. So if we want low local data latencies in any region across the world, Cassandra is the better fit, and it also provides full control over data consistency.
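In Cassandra, the multi-datacenter placement is declared per keyspace. The sketch below builds such a definition using `NetworkTopologyStrategy`, which places a chosen number of replicas in each datacenter; the keyspace and datacenter names are hypothetical (real names come from your snitch configuration):

```python
# Sketch: a multi-datacenter keyspace definition for Cassandra.
# Keyspace and datacenter names here are hypothetical examples.
def multi_dc_keyspace(name, replication):
    """Build a CREATE KEYSPACE statement using NetworkTopologyStrategy,
    which keeps `rf` replicas in each listed datacenter."""
    dcs = ", ".join(f"'{dc}': {rf}" for dc, rf in replication.items())
    return (
        f"CREATE KEYSPACE {name} WITH replication = "
        f"{{'class': 'NetworkTopologyStrategy', {dcs}}};"
    )

cql = multi_dc_keyspace("app_data", {"us_east": 3, "eu_west": 2})
```

A client in either region can then read and write against its local replicas, with the consistency level chosen per request.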
Let's take a scenario that requires a large number of increments against a few counters, with the ability to read the current value of each counter. Scaling the throughput of an individual counter is quite difficult, because each increment is a direct read/write operation. If more than one node has to handle a single counter, reads become slow, since they involve all of those nodes. And if a request times out, we have to retry it, because we don't know whether the previous attempt succeeded; applying the same update twice this way frequently causes long latencies or load spikes across the cluster. In DynamoDB, there is an atomic counter that is more reliable and has low latency, and it applies one increment per operation performed.
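In DynamoDB, such a counter is implemented with an `UpdateItem` call that uses an `ADD` action. The sketch below only builds the request parameters (table, key, and attribute names are hypothetical; no AWS call is made), which is how the increment is expressed on the wire:

```python
# Sketch: request parameters for a DynamoDB atomic counter.
# Table and attribute names are hypothetical; no AWS call is made here.
def increment_request(table, key, delta=1):
    """Build UpdateItem parameters that atomically add `delta` to a
    counter attribute, creating it at `delta` if it does not exist."""
    return {
        "TableName": table,
        "Key": {"CounterId": {"S": key}},
        "UpdateExpression": "ADD hits :d",
        "ExpressionAttributeValues": {":d": {"N": str(delta)}},
        "ReturnValues": "UPDATED_NEW",   # read back the new count
    }

params = increment_request("PageCounters", "home", delta=5)
# With boto3 this would be sent as: client.update_item(**params)
```

Because the add is performed server-side, no read-modify-write round trip is needed on the client, which is what keeps a hot counter fast.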
DynamoDB also handles overload effectively. If we exceed the provisioned throughput, we quickly get a ProvisionedThroughputExceededException, while other requests remain unaffected. This is very useful for a heavily loaded site where thousands of requests arrive at a time and latency spikes would otherwise cause queues to build up rapidly.
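When DynamoDB signals that the provisioned throughput has been exceeded, the standard client-side response is to retry with jittered exponential backoff. A minimal sketch, with a local `ThrottledError` standing in for the service's throttling exception and `do_request` standing in for any SDK call:

```python
import random
import time

class ThrottledError(Exception):
    """Stand-in for DynamoDB's throughput-exceeded error."""

def call_with_backoff(do_request, max_attempts=5, base_delay=0.05):
    """Retry a throttled request with jittered exponential backoff,
    so a hot table sheds load instead of amplifying the spike."""
    for attempt in range(max_attempts):
        try:
            return do_request()
        except ThrottledError:
            if attempt == max_attempts - 1:
                raise                      # give up after the last try
            # Sleep 0..base_delay * 2^attempt seconds before retrying.
            time.sleep(random.uniform(0, base_delay * 2 ** attempt))
```

In practice the AWS SDKs apply this kind of retry policy for you; the sketch just makes the mechanism explicit.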
In Cassandra, scaling up with virtual nodes is pretty easy, but scaling down remains slow, manual, and error prone. While data is streaming, nodes joining or leaving the ring can fail as a group, which then requires repair; data lost during a decommission operation must be restored from a backup. In DynamoDB, scale-up becomes effortless with a single-line command that waits for a while to