Handling conflicting data
As we explored above, Cassandra's masterless replication can lead to situations in which
multiple versions of the same record exist on different nodes. Since there is no master node
containing the canonical copy of a record, Cassandra must use other means to determine
which version of the data is correct.
This situation comes into play when reading data at any consistency level that consults more
than one replica. When our application requests a row from Cassandra, we receive a single
response with that row's data; each column contains exactly one value. However, if we're
reading at a consistency level such as QUORUM or ALL, Cassandra internally fetches copies of
the data from multiple nodes, and those copies may contain conflicting data. It's up to
Cassandra to figure out exactly what to return to us.
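To make the difference between consistency levels concrete, here is a minimal sketch of how many replicas must respond to a read at each of the levels mentioned above. The function name replicas_required is hypothetical and is not part of any Cassandra driver; the quorum arithmetic (a majority of the replication factor) is the standard definition.

```python
def replicas_required(consistency_level: str, replication_factor: int) -> int:
    """Return how many replicas must answer a read at the given level.

    Hypothetical helper for illustration only; it covers just the three
    levels discussed in the text.
    """
    if consistency_level == "ONE":
        return 1
    if consistency_level == "QUORUM":
        # A quorum is a strict majority of the replicas.
        return replication_factor // 2 + 1
    if consistency_level == "ALL":
        return replication_factor
    raise ValueError(f"unsupported consistency level: {consistency_level}")


# With three replicas, only ONE avoids comparing multiple copies.
for level in ("ONE", "QUORUM", "ALL"):
    print(level, replicas_required(level, replication_factor=3))
```

With a replication factor of 3, this prints 1, 2, and 3 respectively, so both QUORUM and ALL reads have more than one copy to reconcile.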
The problem is most acute when different clients are writing the same piece of data
concurrently. Let's return to a scenario we explored in Chapter 7, Expanding Your Data Model:
two employees of HappyCorp, Heather and Charles, are simultaneously attempting to update
the location field in the user record of HappyCorp's shared account. Let's suppose
that we are writing data at consistency level ONE. This concurrent operation could be
carried out via the following sequence of events:
1. Heather updates the location to New York. The update is acknowledged by
Replica 1.
2. Charles updates the location to Palo Alto. The update is acknowledged by
Replica 2.
Just after Heather and Charles's concurrent updates, the Replica 1 copy of the HappyCorp
user record will contain New York in its location field, and the Replica 2 copy will
contain Palo Alto. Now, before the updates have a chance to propagate to any nodes except the
ones that respectively acknowledged them, let's read the data back at consistency level ALL.
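The state of the replicas at this moment can be modeled with a toy sketch. Everything here is an illustrative assumption: the replicas dictionary and the write_at_one and read_from_all helpers are hypothetical names, not Cassandra internals or driver calls. The sketch only shows which versions a read that consults every replica would gather; it does not attempt to resolve the conflict.

```python
# Toy model (not Cassandra's implementation): each replica is a dict
# mapping column names to values for the HappyCorp user record.
replicas = {
    "Replica 1": {},
    "Replica 2": {},
    "Replica 3": {},
}


def write_at_one(replica_name, column, value):
    """Apply a write that only the named replica has acknowledged so far."""
    replicas[replica_name][column] = value


def read_from_all(column):
    """Collect the value each replica holds for the column (None if absent)."""
    return {name: data.get(column) for name, data in replicas.items()}


write_at_one("Replica 1", "location", "New York")   # Heather's update
write_at_one("Replica 2", "location", "Palo Alto")  # Charles's update

# Before either update propagates, a read at ALL sees three different versions.
versions = read_from_all("location")
print(versions)
# {'Replica 1': 'New York', 'Replica 2': 'Palo Alto', 'Replica 3': None}
```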
When Cassandra receives the read request, it will fetch HappyCorp's user record from
Replicas 1, 2, and 3. Each replica will contain a different version of the record: Replica 1's
copy has New York in the location field, Replica 2's has Palo Alto, and Replica 3's does
not contain anything in that field. So what location will Cassandra actually return to us?