SHOULD YOU PRE-ENCODE STRINGS?
Application servers spend a lot of time doing character conversion: converting from Java String
objects (which are stored in UTF-16 format) to whatever bytes the client is expecting. Many of
these strings are always the same. The HTML strings in a web page don't usually change based
on the data (and if they do, they are still drawn from a constant set of strings).
Should these strings be pre-encoded into a byte array so that the byte array can be reused? The
answer depends on the application server and application.
The HTML strings in JSP pages are written by the application server. Whether or not those strings
are pre-encoded depends on the server; some servers offer this ability as an option, and some do it
automatically.
In a servlet, those strings can be pre-encoded and sent to the network using the write() method
of a ServletOutputStream rather than using the print() method of a PrintWriter. Dynamic
data, though, still needs to go through the print() method so that it is correctly encoded for the
target. (You can figure out the target encoding from the headers and encode the string yourself,
but that is relatively error prone.)
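As a rough illustration of the pre-encoding idea, the following sketch encodes the static markup once and writes the resulting byte arrays directly to an output stream. It is a minimal sketch under stated assumptions, not a servlet: the class name PreEncodedDemo and the render() method are hypothetical, a plain java.io.OutputStream stands in for the ServletOutputStream, and the dynamic text is encoded by hand as UTF-8, which assumes the client's target encoding is already known (the error-prone step noted above).

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

public class PreEncodedDemo {
    // Static markup is encoded once, when the class loads, rather than per request.
    private static final byte[] HEADER =
        "<html><body><h1>".getBytes(StandardCharsets.UTF_8);
    private static final byte[] FOOTER =
        "</h1></body></html>".getBytes(StandardCharsets.UTF_8);

    // 'out' is a stand-in for the stream a servlet would obtain from its response.
    // Assumption: the target encoding is UTF-8; a real servlet would have to
    // determine the encoding from the request and response.
    public static void render(OutputStream out, String dynamicText) throws IOException {
        out.write(HEADER);                                        // no encoding work here
        out.write(dynamicText.getBytes(StandardCharsets.UTF_8));  // encoded per call
        out.write(FOOTER);                                        // no encoding work here
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        render(sink, "Stock history for ACME");
        System.out.println(sink.toString("UTF-8"));
    }
}
```

Only the dynamic text passes through the encoder on each call; the header and footer bytes are reused as they are.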
There are important differences in the way application servers implement those output interfaces
and where the application server is buffering data within them. In some servers, mixing calls
between the servlet output stream and its companion print writer causes network buffers to be
flushed frequently. Frequent flushing of the buffers is very expensive in terms of performance,
more expensive than re-encoding the data would be. Similarly, encoding a large block of
data is usually not much more expensive than encoding a small block of data: setting up the call
to the encoders is what is expensive in the first place. Hence, frequent calls to the encoder for
small pieces of dynamic data interleaved with calls to send the pre-encoded byte arrays may slow
down an application: the many calls to the encoder will take longer to execute than one call to
encode everything (including the static data).
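The two strategies being compared can be sketched side by side. This is an illustrative sketch only: the class BatchEncodeDemo and its method names are hypothetical, a ByteArrayOutputStream stands in for the server's buffered output, and UTF-8 is assumed as the target encoding. Both methods produce identical bytes; which one is faster depends on the server and the data, and has to be measured.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

public class BatchEncodeDemo {
    // Pre-encoded static fragments, created once.
    private static final byte[] ROW_OPEN =
        "<tr><td>".getBytes(StandardCharsets.UTF_8);
    private static final byte[] ROW_CLOSE =
        "</td></tr>".getBytes(StandardCharsets.UTF_8);

    // Strategy A: reuse pre-encoded markup, but call the encoder once
    // for every small piece of dynamic data.
    public static byte[] interleaved(String[] values) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (String v : values) {
            out.write(ROW_OPEN, 0, ROW_OPEN.length);
            byte[] encoded = v.getBytes(StandardCharsets.UTF_8); // one encoder call per value
            out.write(encoded, 0, encoded.length);
            out.write(ROW_CLOSE, 0, ROW_CLOSE.length);
        }
        return out.toByteArray();
    }

    // Strategy B: assemble static and dynamic text as characters,
    // then make a single encoder call for everything.
    public static byte[] encodedOnce(String[] values) {
        StringBuilder page = new StringBuilder();
        for (String v : values) {
            page.append("<tr><td>").append(v).append("</td></tr>");
        }
        return page.toString().getBytes(StandardCharsets.UTF_8); // one encoder call in total
    }
}
```

With many short dynamic values, strategy A makes one encoder call per value while strategy B makes exactly one, which is the trade-off described above.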
There are cases where pre-encoding in your code can help, but it is definitely a your-mileage-may-vary
situation.
These optimizations can make a significant difference in real-world (as opposed to laboratory)
performance. Table 10-1 shows the kind of results that might be expected. Tests were
run using the long-output form of the stock history servlet, retrieving data for a 10-year period.
This results in an uncompressed, untrimmed HTML page of about 100 KB. To minimize
the effect of bandwidth considerations, the test ran only a single user with a 100 ms think
time and measured the average response time for requests. In the LAN case, tests were run
over a local network with a 100 MB switch connecting components; in the broadband case,