Batch Challenges
You're undoubtedly familiar with the challenges of GUI-based programming (thick clients and web apps
alike). Security issues. Data validation. User-friendly error handling. Unpredictable usage patterns
causing spikes in resource utilization (have a blog post of yours show up on the front page of Slashdot to
see what I mean here). All of these are byproducts of the same thing: the ability for users to interact with
your software.
However, batch is different. I said earlier that a batch process is a process that can run without
additional interaction to some form of completion. Because of that, most of the issues with GUI
applications are no longer valid. Yes, there are security concerns, and data validation is required, but
spikes in usage and friendly error handling either are predictable or may not even apply to your batch
processes. You can predict the load during a process and design accordingly. You can fail quickly and
loudly with only solid logging and notifications as feedback, because technical resources address any issues.
So everything in the batch world is a piece of cake and there are no challenges, right? Sorry to burst
your bubble, but batch processing presents its own unique twist on many common software
development challenges. Software architecture commonly includes a number of ilities. Maintainability.
Usability. Scalability. These and other ilities are all relevant to batch processes, just in different ways.
The first three ilities—usability, maintainability, and extensibility—are related. With batch, you
don't have a user interface to worry about, so usability isn't about pretty GUIs and cool animations. No,
in a batch process, usability is about the code: both its error handling and its maintainability. Can you
extend common components easily to add new features? Is it covered well in unit tests so that when you
change an existing component, you know the effects across the system? When the job fails, do you know
when, where, and why without having to spend a long time debugging? These are all aspects of usability
that have an impact on batch processes.
Next is scalability. Time for a reality check: when was the last time you worked on a web site that
truly had a million visitors a day? How about 100,000? Let's be honest: most web sites developed in large
corporations aren't viewed nearly that many times. However, it's not a stretch to have a batch process
that needs to process 100,000 to 500,000 transactions in a night. Let's consider 4 seconds to load a web
page to be a solid average. If it takes that long to process a transaction via batch, then processing 100,000
transactions will take more than four days (and a month and a half for 1 million). That isn't practical for
any system in today's corporate environment. The bottom line is that the scale that batch processes
need to be able to handle is often one or more orders of magnitude larger than that of the web or thick-
client applications you've developed in the past.
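
The arithmetic behind those figures can be checked with a short sketch. The class and method names here (BatchEstimate, sequentialDuration, perItemBudget) are hypothetical and not part of any framework, and the 8-hour nightly window is an assumed figure used only for illustration.

```java
import java.time.Duration;

// Hypothetical helper for back-of-the-envelope batch sizing; not part of any framework.
final class BatchEstimate {

    private BatchEstimate() {}

    // Total wall-clock time when items are processed strictly one after another.
    static Duration sequentialDuration(long itemCount, Duration perItem) {
        return perItem.multipliedBy(itemCount);
    }

    // Longest per-item time that still lets itemCount items finish inside the window.
    static Duration perItemBudget(long itemCount, Duration window) {
        return window.dividedBy(itemCount);
    }

    public static void main(String[] args) {
        Duration webPageTime = Duration.ofSeconds(4); // the 4-second figure from the text

        Duration small = sequentialDuration(100_000, webPageTime);
        Duration large = sequentialDuration(1_000_000, webPageTime);

        // Prints: 100,000 items: 4 days 15 hours
        System.out.println("100,000 items: " + small.toDays() + " days "
                + small.toHoursPart() + " hours");
        // Prints: 1,000,000 items: 46 days 7 hours
        System.out.println("1,000,000 items: " + large.toDays() + " days "
                + large.toHoursPart() + " hours");

        // Assumption: an 8-hour nightly window, chosen only for illustration.
        Duration budget = perItemBudget(100_000, Duration.ofHours(8));
        // Prints: Per-item budget: 288 ms
        System.out.println("Per-item budget: " + budget.toMillis() + " ms");
    }
}
```

Under the assumed 8-hour window, each of 100,000 transactions gets roughly 288 milliseconds if they are processed one at a time, which is more than ten times faster than the 4-second web page.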
Third is availability. Again, this is different from the web or thick-client applications you may be
used to. Batch processes typically aren't 24/7. In fact, they typically have an appointment. Most
enterprises schedule a job to run at a given time when they know the required resources (hardware,
data, and so on) are available. For example, take the need to build statements for retirement accounts.
Although you can run the job at any point in the day, it's probably best to run it some time after the
market has closed so you can use the closing fund prices to calculate balances. Can you run when you
need to? Can you get the job done in the time allotted so you don't impact other systems? These and
other questions affect the availability of your batch system.
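
One way to make the "time allotted" question concrete is to model the appointment as a window and check whether an estimated run fits inside it. This is only a sketch: BatchWindow and fits are hypothetical names, and the 18:00 to 06:00 window is an assumed example, not a schedule taken from the text.

```java
import java.time.Duration;
import java.time.LocalDateTime;

// Hypothetical model of the time slot a scheduled job is allowed to occupy.
final class BatchWindow {

    private final LocalDateTime opens;
    private final LocalDateTime closes;

    BatchWindow(LocalDateTime opens, LocalDateTime closes) {
        if (!closes.isAfter(opens)) {
            throw new IllegalArgumentException("window must close after it opens");
        }
        this.opens = opens;
        this.closes = closes;
    }

    // True when a job starting at 'start' and running for 'estimated'
    // begins no earlier than the opening and ends no later than the close.
    boolean fits(LocalDateTime start, Duration estimated) {
        return !start.isBefore(opens) && !start.plus(estimated).isAfter(closes);
    }

    public static void main(String[] args) {
        // Assumed window: opens at 18:00 and closes at 06:00 the next morning.
        BatchWindow window = new BatchWindow(
                LocalDateTime.of(2024, 1, 2, 18, 0),
                LocalDateTime.of(2024, 1, 3, 6, 0));

        LocalDateTime start = LocalDateTime.of(2024, 1, 2, 19, 0);

        System.out.println(window.fits(start, Duration.ofHours(5)));  // true: ends at midnight
        System.out.println(window.fits(start, Duration.ofHours(12))); // false: ends at 07:00
    }
}
```

A check like this is only as good as the duration estimate fed into it, so in practice the estimate would come from measured runs rather than a guess.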
Finally, you must consider security. Typically, in the batch world, security doesn't revolve around
people hacking into the system and breaking things. The role a batch process plays in security is in
keeping data secure. Are sensitive database fields encrypted? Are you logging personal information by
accident? How about access to external systems—do they need credentials, and are you securing those
in the appropriate manner? Data validation is also part of security. Generally, the data being processed
has already been vetted, but you still should be sure that rules are followed.
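
As an illustration of the logging concern, a sensitive value can be masked before it ever reaches a log statement. The sketch below is one possible approach, not a prescribed one: LogMasking and maskAllButLast4 are hypothetical names, and keeping the last four characters visible is an assumed policy.

```java
// Hypothetical utility for keeping sensitive values out of log output.
final class LogMasking {

    private LogMasking() {}

    // Replaces every character except the last four with '*'.
    // Null values and values of four characters or fewer are fully masked.
    static String maskAllButLast4(String value) {
        if (value == null || value.length() <= 4) {
            return "****";
        }
        int hidden = value.length() - 4;
        return "*".repeat(hidden) + value.substring(hidden);
    }

    public static void main(String[] args) {
        String accountNumber = "123456789"; // made-up sample value

        // Prints: Processing account *****6789
        System.out.println("Processing account " + maskAllButLast4(accountNumber));
    }
}
```

Masking at the point of logging is a last line of defense. It does not replace encrypting the stored fields or securing the credentials used to reach external systems.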
As you can see, plenty of technological challenges are involved in developing batch processes. From
the large scale of most systems to security, batch has it all. That's part of the fun of developing batch processes.