Java Reference
In-Depth Information
the message is dequeued and passed to the buggy handler, the transaction is rolled back. Since
the message is now back at the head of the queue, the handler is called over and over with
the same result. (This is sometimes called the poison message problem.) The message hand-
ling thread is not blocked, but it will never make progress either. This form of livelock often
comes from overeager error-recovery code that mistakenly treats an unrecoverable error as a
recoverable one.
Livelock can also occur when multiple cooperating threads change their state in response to
the others in such a way that no thread can ever make progress. This is similar to what hap-
pens when two overly polite people are walking in opposite directions in a hallway: each
steps out of the other's way, and now they are again in each other's way. So they both step
aside again, and again, and again. . .
The solution for this variety of livelock is to introduce some randomness into the retry mech-
anism. For example, when two stations in an ethernet network try to send a packet on the
shared carrier at the same time, the packets collide. The stations detect the collision, and each
tries to send their packet again later. If they each retry exactly one second later, they collide
over and over, and neither packet ever goes out, even if there is plenty of available bandwidth.
To avoid this, we make each wait an amount of time that includes a random component. (The
ethernet protocol also includes exponential backoff after repeated collisions, reducing both
congestion and the risk of repeated failure with multiple colliding stations.) Retrying with
random waits and backoffs can be equally effective for avoiding livelock in concurrent ap-
plications.
Summary
Liveness failures are a serious problem because there is no way to recover from them short
of aborting the application. The most common form of liveness failure is lock-ordering dead-
lock. Avoiding lock ordering deadlock starts at design time: ensure that when threads acquire
multiple locks, they do so in a consistent order. The best way to do this is by using open calls
throughout your program. This greatly reduces the number of places where multiple locks are
held at once, and makes it more obvious where those places are.
Search WWH ::




Custom Search