tended lock will almost always have at least one thread waiting to acquire it and so
will frequently appear in thread dumps.
If your application is keeping the CPUs sufficiently hot, you can use monitoring tools to infer
whether it would benefit from additional CPUs. A program with only four threads may be
able to keep a 4-way system fully utilized, but is unlikely to see a performance boost if moved
to an 8-way system since there would need to be waiting runnable threads to take advantage
of the additional processors. (You may also be able to reconfigure the program to divide its
workload over more threads, such as adjusting a thread pool size.) One of the columns reported by vmstat is the number of threads that are runnable but not currently running because a CPU is not available; if CPU utilization is high and there are always runnable threads waiting for a CPU, your application would probably benefit from more processors.
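The pool-sizing suggestion above can be sketched with the standard executor API. This is a minimal illustration (the class name PoolSizing and the task bodies are placeholders, not from the text): the pool is sized to the processor count so that, under load, there is one runnable worker per CPU.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class PoolSizing {
    public static void main(String[] args) throws InterruptedException {
        // Size the worker pool to the number of available processors,
        // so CPU-bound tasks can keep every CPU busy without leaving
        // extra runnable threads waiting for a processor.
        int cpus = Runtime.getRuntime().availableProcessors();
        ExecutorService pool = Executors.newFixedThreadPool(cpus);

        // Submit more tasks than threads; the queue absorbs the excess.
        for (int i = 0; i < cpus * 4; i++) {
            pool.execute(() -> {
                long sum = 0;               // stand-in for CPU-bound work
                for (int k = 0; k < 1_000; k++) sum += k;
            });
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        System.out.println("ran tasks on " + cpus + "-way pool");
    }
}
```

If moving to a machine with more processors, re-deriving the pool size from `availableProcessors()` at startup lets the same program use the extra CPUs without a code change.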
11.4.7. Just Say No to Object Pooling
In early JVM versions, object allocation and garbage collection were slow, but their performance has improved substantially since then. In fact, allocation in Java is now faster than malloc is in C: the common code path for new Object in HotSpot 1.4.x and 5.0 is approximately ten machine instructions.
To work around “slow” object lifecycles, many developers turned to object pooling, where
objects are recycled instead of being garbage collected and allocated anew when needed.
Even taking into account its reduced garbage collection overhead, object pooling has been shown to be a performance loss for all but the most expensive objects (and a serious loss for light- and medium-weight objects) in single-threaded programs (Click, 2005).
In concurrent applications, pooling fares even worse. When threads allocate new objects,
very little inter-thread coordination is required, as allocators typically use thread-local alloc-
ation blocks to eliminate most synchronization on heap data structures. But if those threads
instead request an object from a pool, some synchronization is necessary to coordinate access
to the pool data structure, creating the possibility that a thread will block. Because blocking
a thread due to lock contention is hundreds of times more expensive than an allocation, even
a small amount of pool-induced contention would be a scalability bottleneck. (Even an uncontended synchronization is usually more expensive than allocating an object.) This is yet another technique intended as a performance optimization but that turned into a scalability hazard.
Allocating objects is usually cheaper than synchronizing.
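To make the contrast concrete, here is a deliberately simple (hypothetical) pool of the kind the text argues against; the class and method names are illustrative. Every acquire and release must pass through the pool's lock, while plain `new` is served from a thread-local allocation block with no shared coordination.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Supplier;

public class PoolVsNew {
    // Every acquire/release synchronizes on the shared free list --
    // exactly the coordination point that plain allocation avoids.
    static class ObjectPool<T> {
        private final Deque<T> free = new ArrayDeque<>();
        private final Supplier<T> factory;
        ObjectPool(Supplier<T> factory) { this.factory = factory; }
        synchronized T acquire() {
            T obj = free.poll();
            return obj != null ? obj : factory.get();
        }
        synchronized void release(T obj) { free.push(obj); }
    }

    public static void main(String[] args) throws InterruptedException {
        ObjectPool<byte[]> pool = new ObjectPool<>(() -> new byte[64]);

        // Pooled path: two threads contend on the pool's intrinsic lock.
        Runnable pooled = () -> {
            for (int i = 0; i < 100_000; i++) {
                byte[] b = pool.acquire();   // may block on the pool lock
                pool.release(b);
            }
        };
        // Allocation path: no lock; each thread allocates independently.
        Runnable fresh = () -> {
            for (int i = 0; i < 100_000; i++) {
                byte[] b = new byte[64];     // thread-local allocation
            }
        };

        Thread t1 = new Thread(pooled), t2 = new Thread(pooled);
        t1.start(); t2.start();
        t1.join(); t2.join();
        fresh.run();
        System.out.println("done");
    }
}
```

Any timing of these two paths is JVM- and load-dependent; the point of the sketch is structural: the pooled path serializes all threads through one lock, while the allocation path has no cross-thread coordination at all.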