tended lock will almost always have at least one thread waiting to acquire it and so
will frequently appear in thread dumps.
If your application is keeping the CPUs sufficiently hot, you can use monitoring tools to infer
whether it would benefit from additional CPUs. A program with only four threads may be
able to keep a 4-way system fully utilized, but is unlikely to see a performance boost if moved
to an 8-way system since there would need to be waiting runnable threads to take advantage
of the additional processors. (You may also be able to reconfigure the program to divide its
workload over more threads, such as adjusting a thread pool size.) One of the columns reported by vmstat is the number of threads that are runnable but not currently running because a CPU is not available; if CPU utilization is high and there are always runnable threads waiting for a CPU, your application would probably benefit from more processors.
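The pool-sizing suggestion above can be sketched with the standard executor API. This is a minimal illustration (the class name PoolSizing and the task bodies are placeholders, not from the text): the pool is sized to the processor count so that, under load, there is one runnable worker per CPU.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class PoolSizing {
    public static void main(String[] args) throws InterruptedException {
        // Size the worker pool to the number of available processors,
        // so CPU-bound tasks can keep every CPU busy without leaving
        // extra runnable threads waiting for a processor.
        int cpus = Runtime.getRuntime().availableProcessors();
        ExecutorService pool = Executors.newFixedThreadPool(cpus);

        // Submit more tasks than threads; the queue absorbs the excess.
        for (int i = 0; i < cpus * 4; i++) {
            pool.execute(() -> {
                long sum = 0;               // stand-in for CPU-bound work
                for (int k = 0; k < 1_000; k++) sum += k;
            });
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        System.out.println("ran tasks on " + cpus + "-way pool");
    }
}
```

If moving to a machine with more processors, re-deriving the pool size from `availableProcessors()` at startup lets the same program use the extra CPUs without a code change.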
11.4.7. Just Say No to Object Pooling
In early JVM versions, object allocation and garbage collection were slow, but their performance has improved substantially since then. In fact, allocation in Java is now faster than malloc is in C: the common code path for new Object in HotSpot 1.4.x and 5.0 is approximately ten machine instructions.
To work around “slow” object lifecycles, many developers turned to object pooling, where
objects are recycled instead of being garbage collected and allocated anew when needed.
Even taking into account its reduced garbage collection overhead, object pooling has been shown to be a performance loss for all but the most expensive objects (and a serious loss for light- and medium-weight objects) in single-threaded programs (Click, 2005).
In concurrent applications, pooling fares even worse. When threads allocate new objects,
very little inter-thread coordination is required, as allocators typically use thread-local alloc-
ation blocks to eliminate most synchronization on heap data structures. But if those threads
instead request an object from a pool, some synchronization is necessary to coordinate access
to the pool data structure, creating the possibility that a thread will block. Because blocking
a thread due to lock contention is hundreds of times more expensive than an allocation, even
a small amount of pool-induced contention would be a scalability bottleneck. (Even an uncontended synchronization is usually more expensive than allocating an object.) This is yet another technique intended as a performance optimization but that turned into a scalability hazard.
Allocating objects is usually cheaper than synchronizing.
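To make the contrast concrete, here is a deliberately simple (hypothetical) pool of the kind the text argues against; the class and method names are illustrative. Every acquire and release must pass through the pool's lock, while plain `new` is served from a thread-local allocation block with no shared coordination.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Supplier;

public class PoolVsNew {
    // Every acquire/release synchronizes on the shared free list --
    // exactly the coordination point that plain allocation avoids.
    static class ObjectPool<T> {
        private final Deque<T> free = new ArrayDeque<>();
        private final Supplier<T> factory;
        ObjectPool(Supplier<T> factory) { this.factory = factory; }
        synchronized T acquire() {
            T obj = free.poll();
            return obj != null ? obj : factory.get();
        }
        synchronized void release(T obj) { free.push(obj); }
    }

    public static void main(String[] args) throws InterruptedException {
        ObjectPool<byte[]> pool = new ObjectPool<>(() -> new byte[64]);

        // Pooled path: two threads contend on the pool's intrinsic lock.
        Runnable pooled = () -> {
            for (int i = 0; i < 100_000; i++) {
                byte[] b = pool.acquire();   // may block on the pool lock
                pool.release(b);
            }
        };
        // Allocation path: no lock; each thread allocates independently.
        Runnable fresh = () -> {
            for (int i = 0; i < 100_000; i++) {
                byte[] b = new byte[64];     // thread-local allocation
            }
        };

        Thread t1 = new Thread(pooled), t2 = new Thread(pooled);
        t1.start(); t2.start();
        t1.join(); t2.join();
        fresh.run();
        System.out.println("done");
    }
}
```

Any timing of these two paths is JVM- and load-dependent; the point of the sketch is structural: the pooled path serializes all threads through one lock, while the allocation path has no cross-thread coordination at all.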