Installing the Tools for Tuning (JBoss AS 5) Part 5

How to create a stateful Test Plan

By default, when you queue up several HTTP Requests, JMeter treats every request as stateless: a new HTTP Session is created for each one, just as if you opened a browser window, issued the request, and closed the browser. If you want each user's requests to share Session data, instruct JMeter to use cookies to persist the HTTP Session. From the menu select Add | Config Element | HTTP Cookie Manager. The HTTP Cookie Manager needs to be added just below the Thread Group element, so that it is shared among all HTTP Requests.

Running JMeter as a shell

Your hardware's capabilities will inevitably limit the number of threads you can effectively run with JMeter. If you need to set up a large-scale test and cannot afford the overhead of the JMeter GUI, consider launching JMeter from the command line.

You can use the following parameters in order to run JMeter from the command prompt:

• -n: This specifies JMeter is to run in non-GUI mode.

• -t: Name of JMX file that contains the Test Plan

• -l: Name of JTL file to log sample results to

• -r: Run all remote servers specified in JMeter properties.

• -H: Proxy server hostname or IP address, if run using firewall/proxy


• -P: Proxy server port, if run using firewall/proxy

For example:

jmeter -n -t test1.jmx -l logfile.jtl

JMeter will then execute the test plan contained in the test1.jmx file, logging the sample results to the file logfile.jtl.
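The options listed above can also be assembled from a script, which is handy when the same test plan is launched repeatedly. The following is a minimal sketch, not part of the original text: the function name build_jmeter_command is hypothetical, and it assumes the JMeter launcher is reachable as jmeter on the PATH (on Windows this would typically be jmeter.bat).

```python
def build_jmeter_command(test_plan, log_file, remote=False,
                         proxy_host=None, proxy_port=None):
    """Return the argument list for a non-GUI JMeter run.

    Assumption: the launcher is called "jmeter" and is on the PATH.
    """
    # -n: non-GUI mode, -t: JMX test plan, -l: JTL file for sample results
    command = ["jmeter", "-n", "-t", test_plan, "-l", log_file]
    if remote:
        # -r: run all remote servers listed in the JMeter properties
        command.append("-r")
    if proxy_host is not None:
        # -H: proxy server hostname or IP address
        command += ["-H", proxy_host]
    if proxy_port is not None:
        # -P: proxy server port
        command += ["-P", str(proxy_port)]
    return command


print(" ".join(build_jmeter_command("test1.jmx", "logfile.jtl")))
```

The returned list can be passed directly to subprocess.run to start the test.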

Operating system tools and commands

The last (but not least) element that influences the performance of an application is the hardware on which it runs. To check how heavily your application uses a given hardware resource, you can use common tools and utilities that are usually already installed on your machine. Generally speaking, Unix systems ship with a larger set of built-in commands for monitoring the host. Windows users can, however, find plenty of utilities with a simple search on the net, and we will point to some good choices.

Windows users

One of the most useful applications is the Performance Monitor, which can track a wide range of attributes of your machine and give you a real-time graphical display of the results. On Windows XP, you can start the Performance Monitor by opening the Control Panel and clicking Performance and Maintenance | Administrative Tools | Performance.

There you can add new counters to your graph by selecting the (+) button on the toolbar. As you look at the output, you can see that the lines on the graph correspond to the counters that you have added. Once you have identified the bottlenecks of your machine, you can narrow your analysis further with the Task Manager application.

From the Task Manager application, first select the Processes tab to view the list of processes that are running on your machine. Next, choose the Select Columns command from the View menu. You’ll now see a list of all of the resources that you can monitor through the Task Manager.

For example, the following image illustrates a monitoring session which is observing the system input-output:

[Figure: Task Manager session monitoring the system input/output]

The Windows Performance Monitor and the Task Manager are functional but basic tools for keeping an eye on what your computer is up to. If you want to go beyond the built-in tools and get more in-depth information and control, check the following alternatives:

Tool              URL
Process Explorer  http://technet.microsoft.com/en-us/sysinternals/bb896653.aspx
System Explorer   http://systemexplorer.mistergroup.org/
Manage Engine     http://www.manageengine.com (Commercial)

Windows monitor

Unix users

Unix and Linux users have a great number of tools available for monitoring basic system activity, usually in the form of shell commands.

The most popular utility is top, which provides an ongoing look at processor activity in real time. It displays a listing of the most CPU-intensive tasks on the system and provides an interactive interface for manipulating processes. It can sort the tasks by CPU usage, memory usage, and runtime. See the following example:

[Figure: sample top output, with the summary lines at the top highlighted]

The upper highlighted lines show statistics about the machine, including real-time CPU and memory data. The lower section shows per-process information: for example, you can see that there is one java process, run by the user jboss, which occupies 103 MB of resident set size memory (the non-swapped physical memory that a process uses) and is using about 20% of the CPU time.

Topping for Solaris

Solaris users can use the prstat utility instead of top; it is used in much the same way to view a system's activity and resource consumption.

Another useful tool, available in most Unix flavors, is vmstat. As its name suggests, this utility reports virtual memory statistics: how much virtual memory there is, together with CPU and paging activity.

To monitor the virtual memory activity on your system, it's best to use vmstat with a delay. A delay is the number of seconds between updates. If you don't supply a delay, vmstat reports only the averages since the last boot. Five seconds is the recommended delay interval.

To run vmstat with a five-second delay, type:

vmstat 5

Here’s an example of system activity:

[Figure: sample vmstat output]

The output from this command is divided into five sections:

• procs: The number of processes waiting for run time (r), those in uninterruptible sleep (b), and the number of processes swapped out but otherwise runnable (w).

• memory: The amount of virtual memory used (swpd), the amount of idle memory (free), and the amount of memory used as buffers (buff). Figures are in KB.

• swap and io: The amount of memory swapped in from disk (si) and swapped out to disk (so), and the blocks sent to (bo) and received from (bi) a block device.

• system: The number of interrupts per second, including the clock (in), and the number of context switches per second (cs).

• cpu: The percentage of CPU time spent in user time (us), system time (sy), and idle time (id).
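To work with these columns in a script, the vmstat output can be turned into one dictionary per sample row. The following is a rough sketch under stated assumptions: the function name parse_vmstat is hypothetical, the SAMPLE text is made-up illustrative data in the Linux column layout (not the output shown in the original figure), and the exact set of columns varies between Unix flavors, which is why the code reads the column names from the header row instead of hard-coding them.

```python
# Illustrative data only; real output depends on the platform and vmstat version.
SAMPLE = """\
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 1  0      0 512000  64000 256000    0    0     5    12  250  400 10  5 84  1
 8  0      0 498000  64000 257000    0    0     0    30  900 2100 30 65  5  0
"""


def parse_vmstat(text):
    """Parse vmstat text into a list of dicts, one per numeric sample row."""
    rows = [line.split() for line in text.strip().splitlines()]
    # The column-name row is the first row whose first field is "r";
    # the row before it only carries the section titles (procs, memory, ...).
    header_index = next(
        (i for i, cols in enumerate(rows) if cols and cols[0] == "r"), None
    )
    if header_index is None:
        return []
    header = rows[header_index]
    samples = []
    for cols in rows[header_index + 1:]:
        # Skip repeated headers and anything that is not a full numeric row.
        if len(cols) == len(header) and all(c.isdigit() for c in cols):
            samples.append(dict(zip(header, (int(c) for c in cols))))
    return samples


for sample in parse_vmstat(SAMPLE):
    print(sample["r"], sample["us"], sample["sy"], sample["id"])
```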

These instruments provide an invaluable source of information to discover bottlenecks in your system caused by your applications or by your hardware. However, once you have evidence of a problem with your system, how do you apply the necessary corrections?

Dealing with low CPU utilization

Having too much idle time on your CPU is not necessarily a good thing. In particular, you should be suspicious if you have a high idle time in conjunction with the following symptoms:

• High idle time across all CPUs with no unusual input/output or network activity.

• High idle time, which does not decrease with increased load.

• Response times degrade too rapidly with increased load.

If you are experiencing any of these symptoms, it's likely that your application server is waiting for some resources to be freed. A fundamental instrument for finding the source of the problem is the application server's Thread Dump, which can be obtained from the VisualVM Threads tab or by means of the JBoss AS jmx-console.

For example, suppose that you have many threads with the following stack trace:

[Figure: sample stack trace from the thread dump]

Then it's likely that some of your threads have caused a deadlock in your application. Definitive proof can be obtained by means of the jstack [pid] command-line utility, which is part of the Java Development Kit distribution. jstack provides a full stack trace for the application along with a diagnostic about deadlocks.
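The jstack check can be automated when you need to probe a server repeatedly. The sketch below is an illustration under assumptions, not the procedure used in the original text: the function names are hypothetical, jstack is assumed to be on the PATH, and the detection relies on the "Java-level deadlock" wording that HotSpot's jstack prints in its deadlock report, which other JVMs may phrase differently.

```python
import subprocess


def capture_thread_dump(pid):
    """Run `jstack <pid>` and return the thread dump as text.

    Assumption: the JDK's jstack tool is on the PATH and the caller
    is allowed to attach to the target JVM.
    """
    result = subprocess.run(
        ["jstack", str(pid)], capture_output=True, text=True, check=True
    )
    return result.stdout


def has_deadlock(thread_dump):
    """Return True if the dump contains a HotSpot deadlock report."""
    return "Java-level deadlock" in thread_dump
```

A typical use would be has_deadlock(capture_thread_dump(pid)), where pid is the process ID of the application server.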

Dealing with high CPU utilization

One of the most pervasive myths among IT technicians is that a high CPU usage is an obvious indicator of a system bottleneck. The following extract from vmstat output is a clear example:

[Figure: vmstat extract showing 45% user and 45% system CPU time]

Here the CPU is 90% busy (45% user + 45% system), but there is no CPU bottleneck on this machine: it is simply working at full potential. As a matter of fact, UNIX internal dispatchers are designed to keep the CPUs as busy as possible. This maximizes task throughput, even if it can be misleading for a neophyte. Even 100% CPU usage is not generally a problem and can instead indicate an optimal state.

The only cause for concern is when the run queue (the r value under the procs column) exceeds the number of CPUs on the server.

[Figure: vmstat extract from a six-CPU machine, showing a high run queue and 65% system time]

In the preceding example, taken from a six-CPU machine, it's clear that there is a CPU constraint because of the high run queue. As a next step, examine the cpu column to understand how the machine consumes CPU time. In this sample, the high system time (65%) shows that you are performing a large number of system calls.

This can happen if you are executing lots of input/output, socket, or timestamp creation operations. Use your performance tools to find the modules that cause excessive or inefficient input/output. One potential candidate is, for example, a class that performs lots of unbuffered input/output. Replacing it with a buffered one could greatly reduce the problem.

A special case is when only one or a few CPUs experience a peak of usage. This scenario usually arises because your system uses a single thread to manage some resources. Your checklist should start with the garbage collection configuration. If garbage collection is correctly configured, you should then verify whether there is contention for access to some resources, which is covered in the next section.
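The two-step reasoning above (first compare the run queue with the number of CPUs, then look at how the CPU time is split) can be written down as a small check. This is a sketch with a hypothetical function name; the run-queue value of 8 in the usage lines is made up for illustration, since only the CPU count (six) and the system time (65%) are given in the text.

```python
import os


def diagnose_cpu(run_queue, usr, sys_time, cpu_count=None):
    """Apply the run-queue rule of thumb to one vmstat sample.

    run_queue: the r value from the procs section
    usr, sys_time: the us and sy percentages from the cpu section
    cpu_count: number of CPUs; defaults to the local machine's count
    """
    if cpu_count is None:
        cpu_count = os.cpu_count() or 1
    # A busy CPU alone is not a bottleneck; a run queue longer than
    # the number of CPUs is.
    if run_queue <= cpu_count:
        return "no CPU constraint"
    if sys_time > usr:
        return "CPU constrained, mostly system time"
    return "CPU constrained, mostly user time"


print(diagnose_cpu(2, 45, 45, cpu_count=6))
print(diagnose_cpu(8, 30, 65, cpu_count=6))
```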

Dealing with high resource contention

One kind of issue related to abnormal CPU utilization (high or low) occurs when a single resource (think of an Object cache, for example) is shared among many users.

You can find proof of it by using the shell command mpstat, which reports spin-on-mutex values for each CPU. In short, this is a measure of kernel contention (if a thread can't acquire a lock, it spins). Here's a sample of output for a machine with a high spin-on-mutex value:

[Figure: mpstat output with a high smtx value]

The smtx measurement shows the number of times a CPU failed to obtain a mutex immediately. Depending upon CPU speed, a reading of more than 500 may be an indication of a system in trouble. If the smtx is greater than 500 on a single CPU and sys dominates usr (that is, system time is larger than user time, and system time is greater than 20%), it is likely that mutex contention is occurring.
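The rule of thumb above translates directly into a predicate that can be applied to each row of mpstat output. The function name and parameter names below are hypothetical; the thresholds (500 for smtx, 20% for system time) are the ones quoted in the text and should be treated as starting points, since the text notes that they depend on CPU speed.

```python
def likely_mutex_contention(smtx, usr, sys_time,
                            smtx_threshold=500, sys_threshold=20):
    """Return True if one CPU's mpstat row suggests mutex contention.

    smtx: spins on mutexes reported for the CPU
    usr, sys_time: user and system time percentages for the same CPU
    """
    return (
        smtx > smtx_threshold        # many failed attempts to get a mutex
        and sys_time > usr           # sys dominates usr
        and sys_time > sys_threshold # and system time is significant
    )


print(likely_mutex_contention(smtx=800, usr=25, sys_time=60))
```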

You should then inspect a Thread Dump to find the source of the problem. For example, the following dump gives definitive evidence that your threads are blocked waiting on a Queue:

[Figure: thread dump showing threads blocked waiting on a Queue]

In order to mitigate this effect, you should introduce additional shared resources; for example, you might distribute your cache through a larger set of JVMs.

Dealing with high disk utilization

Excessive disk utilization is a frequent bottleneck for Enterprise applications. System administrators commonly use the iostat command to report input/output statistics. Here's an example:

[Figure: sample iostat output]

Here, the first two columns indicate the KB read per second (kr/s) and the KB written per second (kw/s). The average service time is indicated by svc_t. The %w column indicates the percentage of time that transactions are waiting for service, while %b is the percentage of time the disk is busy.

You should pay attention if you find excessive values for these indicators:

• High service time (svc_t), generally above 30 ms.

• High %b (above 5).

• Consistently high read/write values.
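The first two indicators have concrete thresholds, so they are easy to check automatically for each device row reported by iostat. The sketch below uses a hypothetical function name and takes the limits quoted in the list as defaults; the third indicator is left out because the text gives no numeric threshold for it.

```python
def disk_warnings(svc_t, busy_pct, max_svc_t=30.0, max_busy=5.0):
    """Return a list of warnings for one iostat device row.

    svc_t: average service time in milliseconds
    busy_pct: the %b value (percentage of time the disk is busy)
    """
    warnings = []
    if svc_t > max_svc_t:
        warnings.append("high service time")
    if busy_pct > max_busy:
        warnings.append("high %b")
    return warnings


print(disk_warnings(svc_t=45.0, busy_pct=12.0))
```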

You should then find out which module is causing the excessive disk utilization. Some possible causes for a Java Enterprise application are:

• Excessive logging.

• Stateful Session Bean Passivation.

• Poorly configured database cache.

If the bottleneck is not caused by your application, then you should consider spreading the file system across two or more disks. As an alternative, move the file system to a faster disk/controller or replace the existing disk/controller with a faster one.

Summary

In this topic, we have introduced some of the most popular tools available for monitoring the JVM, the application server, the applications deployed on it, and the operating system. These are the essential points covered:

• VisualVM is a monitoring tool developed by Sun, which can be used to analyze JVM heap data, track down memory leaks, monitor the garbage collector, and perform memory and CPU profiling.

• The Eclipse TPTP Platform covers the entire performance lifecycle so it can be used also as an all-in-one solution for your projects. In this topic, we have learnt how to use it to profile and test your server applications.

• JMeter is a well-known application, which can be used to set up benchmarks of your web applications but can be equipped for a variety of tests as well.

• Each operating system has built-in tools, which can monitor your hardware resources.

° Windows users can opt for the Performance Monitor and the Task Manager as well as many freely available utilities.

° Unix/Linux users have a great number of tools available, the most popular being the shell commands top and vmstat.

In the next topic, we will review the first area of tuning introduced, that is the Java Virtual Machine, showing some concrete application use cases.
