Database Reference
In-Depth Information
Introducing concurrency
While this approach works fine in isolation, we'll run into a problem if more than one pro-
cess needs to concurrently update the
starred_by_users
column in the same row.
Concretely, if two users star a status update at roughly the same time, we'll run into a prob-
lem. Consider the following sequence of events:
1. Both
carol
and
dave
want to star
alice
's status update.
2. Each of their requests is handled by a separate, concurrent process within the
MyStatus application.
3. The process of the character
carol
receives the request and begins by reading the
current value of the
starred_by_users
column, retrieving
["bob"]
.
4. The process of the character
dave
receives its request, and also reads the current
value of
starred_by_users
, which is still
["bob"]
.
5. The process of the character
carol
adds her name to the list and writes it back to
the row, setting it to
["bob", "carol"]
.
6. The process of the character
dave
adds his name to the list that it read, oblivious
to the fact that it has now changed in the database, writing
["bob", "dave"]
to the row.
7. The character
carol
has now lost the star of the status update.
This is a similar scenario to the one we explored in
Chapter 7
,
Expanding Your Data Model
with concurrent updates to the
users
table and, indeed, optimistic locking would be a vi-
able solution. By introducing a version column to the
user_status_updates
table
and performing conditional updates, we could ensure that the above scenario did not result
in data loss. In particular, step 6 would result in a conditional update not applied. The pro-
cess of the character
dave
could then reread the updated value of the
starred_by_users
list, append the username of
dave
to the list, and retry the up-
date.
This solution works but it's not completely satisfying. In particular, it requires at least one
read operation from the database before the row can be updated; in the case of a version
conflict, additional reads are required. As we discussed in the previous chapter, conditional
writes do carry a performance cost. Finally, this approach adds additional complexity to our
application: we need to have code that handles the case of a stale version and retries the op-
eration.