Performance Tuning Concepts (JBoss AS 5) Part 2

Building the performance test

You are now aware that performance tuning is an iterative process which continues until the software has met your goals in terms of Response Time and Throughput. Let’s look in more detail at each step of the process:

Establish a baseline

The first part of performance tuning consists of building up a baseline. In practice, you need to figure out the conditions under which the application will run. The more precisely you understand how your application will be used, the more successful your performance tuning will be. If you have invested a few days in an accurate analysis, you should already have the basis on which to develop your performance objectives, which are usually measured in terms of response time, throughput (requests per second), and resource utilization.
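As a rough illustration of how the first two objectives can be expressed as numbers, here is a minimal Java sketch. The class name, method names, and sample figures are assumptions made up for this example; they do not come from any JBoss tool.

```java
// Hypothetical helper: turns raw measurements into the two baseline indicators.
class PerfMetrics {

    // Mean response time in milliseconds over the recorded samples (0 if there are none).
    static double meanResponseMillis(long[] responseMillis) {
        if (responseMillis.length == 0) {
            return 0.0;
        }
        long total = 0;
        for (long sample : responseMillis) {
            total += sample;
        }
        return (double) total / responseMillis.length;
    }

    // Throughput in requests per second: completed requests divided by wall-clock seconds.
    static double throughputPerSecond(int completedRequests, long elapsedMillis) {
        if (elapsedMillis <= 0) {
            throw new IllegalArgumentException("elapsedMillis must be positive");
        }
        return completedRequests * 1000.0 / elapsedMillis;
    }

    public static void main(String[] args) {
        long[] samples = {120, 80, 100, 100};   // made-up response times in ms
        System.out.println("mean response (ms): " + meanResponseMillis(samples));
        System.out.println("throughput (req/s): " + throughputPerSecond(samples.length, 2000));
    }
}
```

With the sample data above, the mean response time is 100 ms and the throughput is 2 requests per second. Note that the two indicators are measured independently: throughput depends on the wall-clock duration of the test, not on the sum of the individual response times.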

Plan for average users or for peak?

Many types of statistics can be useful when you are building a baseline; however, one of your goals should be to develop a profile of your application’s workload, with special attention to the peaks. For example, many business applications experience daily or monthly peaks depending on a variety of factors. This is especially true for organizations like travel agencies or airline companies, which expect great differences in workload in different periods of the year. In this kind of scenario, it doesn’t make sense to set up a baseline on the average number of users: you have no choice but to use the worst case, that is, the peak number of users.
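The gap between average and peak load can be made concrete with a small sketch. The hourly request counts below are invented sample data, and the class and method names are hypothetical.

```java
// Hypothetical workload profile built from made-up hourly request counts.
class WorkloadProfile {

    // Average number of requests per hour over the observed period.
    static double average(int[] requestsPerHour) {
        if (requestsPerHour.length == 0) {
            return 0.0;
        }
        long total = 0;
        for (int count : requestsPerHour) {
            total += count;
        }
        return (double) total / requestsPerHour.length;
    }

    // Highest hourly request count observed: the value a peak-based baseline targets.
    static int peak(int[] requestsPerHour) {
        int max = 0;
        for (int count : requestsPerHour) {
            if (count > max) {
                max = count;
            }
        }
        return max;
    }

    public static void main(String[] args) {
        int[] requestsPerHour = {120, 90, 80, 150, 400, 950, 1800, 2100, 1700, 600, 300, 200};
        System.out.printf("average=%.1f peak=%d%n", average(requestsPerHour), peak(requestsPerHour));
    }
}
```

In a profile like this one, the peak is several times the average, so a test sized for the average load would say very little about behavior during the busiest hour.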

Collect data

In order to collect data, all applications should be instrumented to provide information for performance analysis. This can be broken down into a set of activities:

• Set up your application server with the same settings and hardware as the production environment and produce a replica of database/naming directories if you can’t use the production legacy systems for testing.

• Isolate the testing environment so that you don’t skew those tests by involving network traffic that doesn’t belong in your tests.

• Install the appropriate tools, which will start the load test, and the counterpart software that collects data from the benchmark. The next topic will point you towards some great resources which can be used to start a session of performance tuning.

How long should data collection last?

If you surf the net you can find plenty of benchmarks affirming that X is faster than Y. Even if micro benchmarks are useful to quickly calculate the response of a single variable (for example, the time to execute a stored procedure), they are of little or no use for testing complex systems. Why? Because many factors in enterprise systems produce their effects after the system has been tested extensively: think about caching systems or JVM garbage collection tuning as a clue.

Investing a huge amount of time in your tuning session is, however, not realistic, as you will likely fail to meet your budget goals, so your performance tests should be completed within a fixed time frame.

Balancing these two factors, we could say that a good performance tuning session should last at least 20-30 minutes (besides warm-up activities, if any) for bread-and-butter applications like the sample Pet Store demo application (http://java.sun.com/developer/releases/petstore/). Larger applications, on the other hand, have more functionality to test and engage a considerable amount of system resources. A complete test plan can demand, in this case, some hours or even days to be completed. As a matter of fact, some dynamics (like garbage collection) can take time to show their effects; benchmarking these kinds of applications on a short-time basis can thus be useless or even misleading.
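The idea of separating warm-up from measurement can be sketched in a few lines of Java. This is only a toy harness under stated assumptions: the class name, the iteration counts, and the sample task are all hypothetical, and a real session would drive the deployed application with a load-testing tool rather than an in-process loop.

```java
// Minimal timing harness: names and iteration counts are illustrative assumptions.
class WarmupBenchmark {

    // Runs the task warmupRuns times without timing it, then measuredRuns times
    // with timing, and returns the elapsed nanoseconds of each measured run.
    static long[] measure(Runnable task, int warmupRuns, int measuredRuns) {
        for (int i = 0; i < warmupRuns; i++) {
            task.run(); // untimed: lets JIT compilation and caches settle first
        }
        long[] elapsedNanos = new long[measuredRuns];
        for (int i = 0; i < measuredRuns; i++) {
            long start = System.nanoTime();
            task.run();
            elapsedNanos[i] = System.nanoTime() - start;
        }
        return elapsedNanos;
    }

    public static void main(String[] args) {
        Runnable task = new Runnable() {
            public void run() {
                StringBuilder sb = new StringBuilder();
                for (int i = 0; i < 10000; i++) {
                    sb.append(i);
                }
            }
        };
        long[] samples = measure(task, 1000, 100);
        long total = 0;
        for (long sample : samples) {
            total += sample;
        }
        System.out.println("mean ns per measured run: " + total / samples.length);
    }
}
```

Discarding the warm-up runs keeps one-off start-up costs out of the results, which is the same reason the 20-30 minute figure above excludes warm-up activities.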

Luckily you can organize your time in such a way that the tuning sessions are planned carefully during the day and then executed with batch scripts at night.

Analyze data

With the amount of data collected, you have evidence of which areas show a performance penalty: keep in mind, however, that this might just be the symptom of a problem which arises in a different area of your application. Technically speaking the analysis procedure can be split into the following activities:

1. Identify the locations of any bottlenecks.

2. Think of a hypothesis which could be the cause of the bottleneck.

3. Consider any factors that may prove/disprove your hypothesis.

At the end of these activities, you should be ready to create a new test which isolates the factor that you suspect to be the cause of the bottleneck.

For example, suppose you are in the middle of a tuning session of an enterprise application. You have identified (Step 1) that the application occasionally pauses and cannot complete all transactions within the strict timeout setting.

Your hypothesis (Step 2) is that the garbage collector configuration needs to be changed because it’s likely that there are too many full cycles of garbage collection.

To test your hypothesis (Step 3), you add a switch to the JVM configuration that prints the details of each garbage collection.
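On HotSpot JVMs, a commonly used switch of this kind is -verbose:gc. As a complementary, in-process way of checking the same hypothesis, the standard java.lang.management API exposes cumulative collection counts and times. The sketch below is an assumption about how you might use it, not a procedure prescribed by JBoss; the class name is hypothetical, and collector names vary between JVMs and garbage collector settings.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper that summarizes garbage collection activity of the current JVM.
class GcReport {

    // Builds one line per collector with its cumulative collection count and time.
    static String report() {
        StringBuilder sb = new StringBuilder();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            sb.append(gc.getName())
              .append(": collections=").append(gc.getCollectionCount())
              .append(", timeMs=").append(gc.getCollectionTime())
              .append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(report());
    }
}
```

Calling report() before and after a test run and comparing the figures gives a first indication of whether the old-generation (full) collector is running more often than expected.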

In short, by carefully examining performance indicators, you can correctly isolate the causes and thus identify the main problems, which must be addressed first. If the data you collect is not complete, then your analysis is likely to be inaccurate and you might need to retest and collect the missing information or use further analysis tools.

Configure and test again

When your analysis is complete, you should have a list of indicators that need testing. First establish a priority list, so that you address the issues that are likely to provide the maximum payoff before the others.

It’s important to stress that you must apply each change individually; otherwise, you can distort the results and make it difficult to identify potential new performance issues.

And that’s it! Get your instruments ready and launch another session of performance testing. You can stop adjusting and measuring when you believe you’re close enough to the response times to satisfy your requirements.

As a side note consider that optimizing code can introduce new bugs so the application should be tested during the optimization phase. A particular optimization should not be considered valid until the application using that optimization’s code path has passed quality assessment.

Tuning Java Enterprise applications

One of the most pervasive myths about Java Enterprise applications is that they simply are slow. The notion of Java being "slow" in popular discussions is often poorly calibrated but, unfortunately, widely believed. The most compelling reason for this sentiment dates back to the first releases of the Java Development Kit. In 1995, Java was much slower: the first implementations of the Java Virtual Machine didn’t have a Just-In-Time compiler, the garbage collection algorithms were not as refined and, generally speaking, lots of applications were written using classes with poor performance characteristics (for example, Input/Output streams without buffering, or overuse of thread-safe collection classes like java.util.Vector).
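To illustrate the buffering point, the following sketch reads a stream one byte at a time through a BufferedInputStream. It is an assumption-laden example: the class name is hypothetical, and an in-memory ByteArrayInputStream stands in for the file or socket stream you would wrap in a real application, where each unbuffered read() may result in a separate call to the operating system.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical example of wrapping a stream in a buffer before byte-wise reads.
class BufferedReadExample {

    // Counts bytes by reading one at a time; this pattern is cheap only
    // when the underlying stream is buffered.
    static long countBytes(InputStream in) throws IOException {
        long count = 0;
        while (in.read() != -1) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[1 << 16]; // 65536 bytes of sample data
        // The in-memory source is a stand-in for a FileInputStream or socket stream.
        InputStream buffered = new BufferedInputStream(new ByteArrayInputStream(data), 8192);
        try {
            System.out.println(countBytes(buffered)); // prints 65536
        } finally {
            buffered.close();
        }
    }
}
```

The 8192-byte buffer size is an arbitrary choice for the example; the buffer lets most read() calls be served from memory instead of reaching the underlying source.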

While the debate continues in many forums, generally featuring benchmarks against the "elder brother" C++, there is some truth in it: even today, many Java applications are still awfully slow. Why?

What happened is that, ironically, even if Sun engineers were able to deliver faster JVMs release after release, programming Java Enterprise applications became more and more complex, and therefore so did writing fast Java applications.

Not so long ago the archetype of a Java Application was made up of a Front Layer (usually developed with JSPs or Swing) and some Middleware, usually developed with a mix of Servlets and Data Access Objects (DAO) that contained the interfaces for the legacy system.

In such a scenario, the architect had to take care of fewer counters and there were only one or perhaps two protocols involved in the communications (HTTP and RMI). With minimal application and web server tuning, along with some DBA tips, you could bring home the desired result.

Today’s enterprise applications are much more complex. Take the input, for example: it can come from HTML as well as a thick client, a web service, or even a mobile device. Also, lots of Java programming interfaces have been wrapped by other frameworks to simplify the developer’s work or enhance productivity. For example, the Java Server Faces (JSF) specification has been built on top of Servlets/JSPs, and then custom libraries (like RichFaces) have been built on top of JSF. Another good example is the Hibernate framework, which has been built on top of JDBC, and then Entities have been built on top of Hibernate.

We might continue discussing other good examples; however, the truth is that each of these extra layers inevitably carries some overhead, and has its own best practices which are usually unknown to the majority of developers.

Our conclusion is that today Java applications have a higher performance potential than they once did, but this needs expert hands and a solid tuning methodology to be allowed in the Eden where fast applications live.

Nevertheless, tuning Java Enterprise applications is more complex than tuning standalone applications, as it requires monitoring and configuring additional components like the application server, which acts as a container for the application, and all resources which are directly controlled by the application server. In the next section, we are going to explore each of the areas which have an impact on the performance of an enterprise application.

Areas of tuning

Configuration and tuning settings can be divided into four main categories:

• Java Virtual Machine (JVM) tuning

• Middleware tuning

• Application tuning

• Operating system / Hardware tuning

Let’s look at each area in more detail:

• JVM tuning: Every Java application runs in a Virtual Machine, so with proper configuration of JVM parameters (in particular those related to memory and garbage collection), it’s possible to achieve better performance of your Java applications. The configuration of JVM has changed a lot since the first releases of Java, and most developers are not aware that the default JVM parameters are usually not optimal for running large applications.

• Middleware tuning controls how an application server provides services to running applications and their components. The application server is a complex piece of software and, at the same time, fertile ground for optimization by expert users. The application server contains a core configuration that is common to all applications (think about the thread pool which is responsible for invoking other components), and also a set of Java EE services which are available for use (like EJB, the web container, JMS, and so on). Each of these services has a default configuration which can be good enough for average applications, but needs to be tweaked in order to obtain superior performance.

• Application tuning requires that you write efficient code in your application, as well as adopt the best performing libraries to achieve the desired task.

Most tuning experts agree that application tuning accounts for about 75% of the overall tuning process. This doesn’t mean that hardware and correct administration configuration are useless. The truth is that even the best hardware and application server configuration will not provide dramatic performance numbers if you are running a poorly coded application. Just to mention a few common mistakes:

° Are you using queries without index on the where fields?

° Are you gathering massive data in the HTTP session?

° Are you issuing a select * and trying to cache all the data in the middle tier?

If you are making any of these mistakes, then there is little you can fix with proper JVM configuration or application server tuning alone.

• Operating system tuning relates to configuring your system and hardware resources so that they can efficiently run the software resources discussed previously. The most common hardware tuning is concerned with physical memory: if you determine that your application has a memory bottleneck, and it’s not caused by inefficient coding, you have no other choice but to add more memory to your machine(s).

Another hot point for tuning hardware is CPU: each application that runs on a server gets a time slice of the CPU. The CPU might be able to efficiently handle all of the processes running on the computer, or it might be overloaded. By examining processor activity and the activity of individual processes including thread creation, thread switching, context switching, and so on, you can gain good insight into processor workload and performance. Again, if the CPU is the bottleneck and it cannot be solved by application tuning, you have to consider adding more CPUs or splitting the load on an array of servers.

• Hardware tuning also includes input/output tuning. Executing long-running file I/O operations, data encryption and decryption, or reading too much data from database tables can turn I/O operations into a serious bottleneck. A shortage of physical memory might also lead to excessive input/output activity if the data cannot fit in the physical memory. Slow hard disks are another factor to consider, and replacing them with faster ones is the only possible solution if you still have disk I/O bottlenecks after optimizing all other factors.

The last hardware component we need to mention is the Network, which is the means by which different applications communicate. Tuning the network means reducing the number of hops your application needs to make in order to reach external systems. You also need to configure your transmission protocols in the most efficient way so that, in turn, your packets are routed in the most efficient way. Again, if you still have a bottleneck in this area, the last resort is to upgrade to a new set of network devices.
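Before tuning any of these areas, it helps to know what resources the JVM actually sees. The sketch below uses only the standard java.lang.Runtime API; the class and method names are hypothetical, and the figures it prints reflect the JVM’s own view (heap limits and processors available to the JVM), which may differ from what operating system tools report.

```java
// Hypothetical snapshot of the CPU and heap resources visible to the current JVM.
class ResourceSnapshot {

    // Number of processors available to the JVM.
    static int cpus() {
        return Runtime.getRuntime().availableProcessors();
    }

    // Maximum heap size the JVM will attempt to use, in bytes.
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    // Heap currently in use: allocated heap minus the free part of it.
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        System.out.println("CPUs available to JVM: " + cpus());
        System.out.println("Max heap (MB):  " + maxHeapBytes() / mb);
        System.out.println("Used heap (MB): " + usedHeapBytes() / mb);
    }
}
```

A used-heap figure that stays close to the maximum during a load test is a hint to look at memory first, whether the cause is inefficient code, JVM settings, or insufficient physical memory.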

Is it possible to optimize all areas of tuning?

Theoretically yes, but in practice, optimization will generally focus on improving just one or two aspects of performance: for example execution time, memory usage, disk space, bandwidth, power consumption, or some other resource. This will usually require a trade-off where one factor is optimized at the expense of others. For example, increasing the size of cache improves runtime performance, but also increases the memory consumption. Other common trade-offs include code clarity and conciseness. In practice you have to define some priorities and code accordingly.
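The cache trade-off can be illustrated with a size-bounded cache. This is a minimal sketch based on the standard java.util.LinkedHashMap; the class name and the capacity are assumptions chosen for the example, and the class is not thread-safe.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical LRU cache capped at maxEntries: a larger cap means more hits
// but more memory; a smaller cap means less memory but more misses.
class BoundedCache<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;
    private final int maxEntries;

    BoundedCache(int maxEntries) {
        super(16, 0.75f, true); // accessOrder=true orders entries by most recent access
        this.maxEntries = maxEntries;
    }

    // Called by LinkedHashMap after each insertion; returning true evicts
    // the least recently used entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        BoundedCache<String, Integer> cache = new BoundedCache<String, Integer>(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");     // "a" becomes the most recently used entry
        cache.put("c", 3);  // exceeds the cap, so "b" is evicted
        System.out.println(cache.keySet()); // prints [a, c]
    }
}
```

The single maxEntries parameter is where the priority decision is made: it fixes the upper bound on memory consumed by the cache and, in exchange, the proportion of requests that must go back to the slower data source.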

The following image synthesizes the concepts we have just covered:

Summary

In this topic, you have learnt the basics of the performance tuning process. Let’s briefly recap the most significant points:

• Performance can be evaluated with two main counters: The Response Time and the Throughput. The Response Time can be defined as the time it takes for one user to perform a task. The Throughput is the number of transactions that can occur in a given amount of time.

• In order to meet higher loads, applications need to be scalable. You can scale your applications vertically (that is, switching to servers with higher capabilities) or horizontally (that is, adding more servers).

• In order to improve the performance of your applications, you have to consider all resources which are around the application: the Java Virtual Machine, the middleware, the hardware, and how you code the application itself.

• Maximum application performance can be achieved only if performance tuning is considered to be a part of your overall software development plan.

In the next topic, we are going to introduce a few essential tools, which you can freely download, and use in order to tune your Enterprise applications and your operating system as well.