Tuning the JBoss AS Part 3

Using Prepared Statements efficiently

When a database receives a statement, the database engine first parses the SQL string and looks for syntax errors. Once the statement is parsed, the database needs to figure out the most efficient plan to execute the statement. This can be computationally quite expensive. Once the query plan is created, the database engine can execute it.

Ideally, if we send the same statement to the database twice, then we’d like the database to reuse the access plan for the first statement. This uses less CPU than if it regenerated the plan a second time. In Java, you can obtain a good performance boost by using Prepared Statements and, instead of concatenating the parameters as a string, by using markers:

tmp39-125_thumb[2]_thumb

This allows the database to reuse the access plans for the statement and makes the program execute more efficiently inside the database. This basically lets your application run faster or makes more CPU available to users of the database.

Prepared Statements can be cached by the application server itself when it’s necessary to issue the same statements across different requests. Enabling the Prepared Statements cache is quite simple; all you have to do is insert the following fragment, with the desired cache size in your datasource file:


tmp39-126_thumb[2]_thumb

This attribute indicates the number of Prepared Statements per connection to be kept open and reused in subsequent requests.

In practice, JBoss AS keeps a list of Prepared Statements for each database connection in the pool. When an application prepares a new Statement on a connection, the application server checks if that statement was previously used. If it was, the Preparedstatement object will be recovered from the cache and this will be returned to the application. If not, the call is passed to the JDBC driver and the query/Preparedstatement object is added in that connections cache.

The cache used by Prepared Statements is an LRU cache (Least Recentely Used). This means that statements are moved at the bottom of the cache until newer statements replace them.

tmp39-127_thumb[2][2]

If you don’t add this attribute, the default is 0 (zero), meaning no cache.

The performance benefit provided by the Prepared Statement cache is very application-specific, varying from 10% up to 30% for an application using a mix of different statements, Prepared Statements, and Callable Statements.

In the following benchmark we have tried to reproduce a similar scenario, scoring a consistent 20% gain for an application using a correctly sized Prepared Statement cache:

tmp39-128

So, you might wonder, how do we set the optimal cache size? The rigorous approach is to check how many unique Statements are sent to the DB. If you haven’t planned to include SQL logging in your application, then you can consider using some third-party projects like P6Spy, which intercepts calls issued to the database driver, and logs the statement along with useful timing statistics.

On the author’s blog you can find a complete tutorial about tracing SQL statements with P6Spy and JBoss AS: http://www.mastertheboss.com/jboss-server/259-how-to-trace-jdbc-statements-with-jboss-as.html.

Two things to be aware of:

1. Prepared statements are cached per connection. The more connections you have, the more Prepared Statements you get (even when they are the same query). So use them with parsimony and don’t simply guess how many are needed by your application.

2. When the connection pool shrinks because the idle timeout for a connection expires, Statements are removed from the pool of cached Prepared Statements. This can cause an overhead, which outweighs the benefit of caching statements.

One good compromise that I have tried on a few projects is to create two DataSources for your application, a larger one (let’s say with up to 30 connections) with no Prepared Statement cache, and a smaller one with a Prepared Statement cache activated and min-pool-size equal to max-pool-size, in order to avoid any pool shrink. Here’s a sample DataSource, which can be used as a template:

tmp39-129_thumb[2][2]

Detecting connection leaks

One of the first things you learn about Java Enterprise classes is that every time you have ended with your connection, you should close it (along with the Statements and ResultSet) in a finally block; this way you’ll avoid running into a connection leak if the method has raised an exception:

tmp39-130

Connection leaks, like memory leaks, are not a direct cause of a performance loss but an inefficient use of system resources, which is often solved by allocating extra resources. Unfortunately system resources are not infinite so, if you need to allocate extra memory to cope with your leaks, there will be less resources available for your application.

Luckily, detecting connection leaks is quite easy and can be done by means of the CachedConnectionManager MBean which is defined in the <server>/deploy/jca-jboss-beans.xml file.

tmp39-131_thumb[2][2]

By setting the Debug attribute to true, the unclosed connections will be monitored by JBoss AS, which will issue this message on the console if you missed closing a connection:

tmp39-132_thumb[2][2]

Please note that the class just below the method WrapperDataSource. getConnection in the Stacktrace: that’s where you forgot to close the connection. (Precisely, in the jspService method of connectionLeak.jsp line 77).

If you need the count of unclosed connections along with their Stacktrace, you can then use the listinUseConnections() method from the CachedConnectionManager MBean which can be invoked from the JMX-console.

tmp39-133

Closing connections is fundamental for keeping your application server healthy; however, Statements and ResultSets, that are instantiated during the connection need to be closed in the same way.

By adding the following fragment to your -ds.xml, the application server takes care of closing Statements and ResultSet if you miss it:

tmp39-134_thumb[2][2]

Use these helper statements only in the development stage!

Checking that your database interfaces are not causing any leak is extremely useful, however you need to know that rolling your applications in production with these flags activated is detrimental for the application performance. The same stands for <new-connection-sql> that executes a SQL statement every time a new connection is created or <check-valid-connection-sql> that checks if the connection from the pool is valid or it has gone stale.

Be sure to leaf off all of these statements when you are going for production if you don’t want to pay a performance penalty.

Should you reuse connections or acquire new ones from the pool?

One of the most common misconceptions among Java developers is that acquiring and closing connections continuously from the pool stresses excessively your application server, so it’s not rare to see classes like DatabaseManager which hold static Connection fields to be shared across the application.

If you are curious to know how much it costs (in terms of milliseconds) to acquire a connection from the pool and release it, the following code will show you:

tmp39-135_thumb[2][2]

This test, executed on a Pentium 4 Dual Core 2.80 Ghz, reveals an unexpected low 0.2 milliseconds cost to check-in / check-out a connection from the server pool.

Of course this doesn’t mean you need to abuse the connection pool when it’s not necessary, for example, it’s always good to reuse the same connection and statements if you are performing a set of database operation within the same thread.

However sharing your connection instance across several business methods is not a good idea as it can easily confuse who is using these classes and it can easily end up with connection leaks. So keep your code simple and tidy and don’t be afraid to open and close your connections in every business method.

Next post:

Previous post: