Processing extremely large data sets
Occasionally you may find that requirements for an application seem to make the
use of large data sets look like the right choice. In most cases, those requirements
can be met in other ways, and in those cases, some probing questions can uncover
the real need.
For instance, suppose you are handed a requirement document dictating that
the entire contents of a 30,000-row data table be output as HTML for users to
browse. The first question to ask is, “Do you really need all of that data?” In our
experience, a “Yes” answer to that question is never logically defensible.
While we do not doubt that the users do need to see all of that data, we do question
whether they need to see all of that data at one time. In almost every case, a filter that
limits what is returned will work better for your users than a “fire hose” report that
just dumps the output of a “select * from table” type of query onto a screen.
If the results are not required for output but are required for processing, you
should seriously consider whether a stored procedure would work better for it
than Java code. Although stored procedures are often viewed as anathema to
the “Write once, run anywhere” goal of Java, we have seen cases where an application
that took 10-15 minutes to run as pure Java ran in under 10 seconds with a
stored procedure. The users of that system do not care about the purity of the
application; they care about being able to use it to get their job done.
No more dodging the question...
So, now that we have tried to avoid the issue of dealing with massive data sets, and
decided that we really do have to deal with them, let's look at what iBATIS provides
to handle them: the RowHandler interface was created just for these cases.
The RowHandler interface is a simple one that allows you to insert behavior into
the processing of a mapped statement's result set. The interface has only one
method:
public interface RowHandler {
    void handleRow(Object valueObject);
}
The handleRow method is called once for each row that is in the result set of a
mapped statement. Using this interface, you are able to handle very large
amounts of data without loading it all into memory at one time. Only one row of
the data is loaded into memory, your code is called, that object is discarded, and
the process repeats until the results have all been processed.
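The row-at-a-time flow described above can be sketched as follows. This is a minimal illustration, not iBATIS code: the nested RowHandler interface is a local stand-in that mirrors the one-method interface shown earlier, and TotalingRowHandler and feedRows are hypothetical names invented for this example. The feedRows method only simulates a framework pushing rows to a handler one by one; no database is involved, and each "row" is assumed to be a plain Integer.

```java
import java.util.Iterator;
import java.util.stream.IntStream;

public class RowHandlerSketch {

    // Local stand-in mirroring the one-method interface shown in the text
    // (assumption: declared here only so the sketch compiles without iBATIS).
    interface RowHandler {
        void handleRow(Object valueObject);
    }

    // Hypothetical handler: keeps running totals instead of keeping the rows.
    static final class TotalingRowHandler implements RowHandler {
        private long rowCount;
        private long total;

        @Override
        public void handleRow(Object valueObject) {
            // Each row is assumed to be an Integer in this sketch.
            rowCount++;
            total += (Integer) valueObject;
        }

        long getRowCount() {
            return rowCount;
        }

        long getTotal() {
            return total;
        }
    }

    // Hypothetical driver that plays the role of the framework: it passes
    // one row at a time to the handler and never collects rows into a list.
    static void feedRows(Iterator<?> rows, RowHandler handler) {
        while (rows.hasNext()) {
            handler.handleRow(rows.next());
        }
    }

    public static void main(String[] args) {
        TotalingRowHandler handler = new TotalingRowHandler();
        // Lazily generated values stand in for a 30,000-row result set.
        Iterator<Integer> rows = IntStream.rangeClosed(1, 30_000).boxed().iterator();
        feedRows(rows, handler);
        System.out.println(handler.getRowCount() + " rows, total " + handler.getTotal());
    }
}
```

Because the handler stores only two counters, its memory use stays constant no matter how many rows pass through it. With iBATIS itself, a handler like this would typically be passed to the SQL map client's queryWithRowHandler method along with a mapped statement name and a parameter object; check the exact signature against the version of iBATIS you are using.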