Spring Batch - Spring Enterprise Recipes: A Problem Solution Approach


The pieces are there, however: transaction support, fast I/O, schedulers such as Quartz and solid

threading support, and a very powerful concept of an application container in Java EE and Spring. It was

only natural that Dave Syer and his team would come along and build Spring Batch, a batch processing

solution for the Spring platform.

It's important to think about the kinds of problems this framework solves before diving into the

details. A technology is defined by its solution space. A typical Spring Batch application reads in

a lot of data and then writes it back out in a modified form. Decisions about transactional barriers, input

size, concurrency, and order of steps in processing are all dimensions of a typical integration.

A common requirement is loading data from a comma-separated value (CSV) file, perhaps as a

business-to-business (B2B) transaction or as an integration technique with a legacy

application. Another common application is nontrivial processing on records in a database. Perhaps

the output is an update of the database record itself. Examples include resizing images on the

file system whose metadata is stored in a database, or triggering another process based on

some condition.

■ Note Fixed-width data is a format of rows and cells, much like a CSV file. Cells in a CSV file are separated by

commas or tabs, whereas fixed-width data presumes a fixed length for each value. The first value

might be the first nine characters, the second value the next four characters after that, and so on.
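
The layout described in the note can be sketched in plain Java. This is a minimal illustration only: the class FixedWidthLineParser and its parse method are hypothetical names invented for this example and are not part of Spring Batch, and the widths of nine and four characters simply mirror the numbers used in the note.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not a Spring Batch class.
class FixedWidthLineParser {

    // Cuts the line into consecutive fields of the given widths and
    // trims the whitespace that pads each value to its fixed length.
    static List<String> parse(String line, int... widths) {
        List<String> fields = new ArrayList<>();
        int start = 0;
        for (int width : widths) {
            int end = start + width;
            if (end > line.length()) {
                throw new IllegalArgumentException(
                        "Line too short: need " + end + " characters, got " + line.length());
            }
            fields.add(line.substring(start, end).trim());
            start = end;
        }
        return fields;
    }

    public static void main(String[] args) {
        // Build a record whose first value is padded to nine characters
        // and whose second value is padded to four.
        String line = String.format("%-9s%-4s", "ACME", "42");
        System.out.println(parse(line, 9, 4)); // prints [ACME, 42]
    }
}
```

Unlike a CSV reader, nothing in the record marks where one value ends and the next begins, so the widths must be agreed on in advance by both the producer and the consumer of the file.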

Fixed-width data, which is often used with legacy or embedded systems, is a fine candidate

for batch processing. Processing that deals with a resource that's fundamentally nontransactional

(for example, a web service or a file) begs for batch processing because batch processing provides

retry/skip/fail functionality that most web services will not.
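
To make the retry/skip/fail idea concrete, here is a rough sketch in plain Java. It does not use Spring Batch's API: RetrySkipSketch, its run method, and the maxAttempts and skipLimit parameters are all assumptions made up for this illustration. Each item is attempted up to maxAttempts times, an item that never succeeds is skipped, and the whole run fails once more than skipLimit items would have to be skipped.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical illustration of retry/skip/fail; not Spring Batch's API.
class RetrySkipSketch {

    // Records which items were written and which were skipped.
    static final class Result<T> {
        final List<T> processed = new ArrayList<>();
        final List<T> skipped = new ArrayList<>();
    }

    static <T> Result<T> run(List<T> items, Consumer<T> writer,
                             int maxAttempts, int skipLimit) {
        Result<T> result = new Result<>();
        for (T item : items) {
            boolean done = false;
            RuntimeException last = null;
            // Retry: call the nontransactional resource up to maxAttempts times.
            for (int attempt = 1; attempt <= maxAttempts && !done; attempt++) {
                try {
                    writer.accept(item);
                    done = true;
                } catch (RuntimeException e) {
                    last = e;
                }
            }
            if (done) {
                result.processed.add(item);
            } else if (result.skipped.size() < skipLimit) {
                // Skip: give up on this item but keep the run going.
                result.skipped.add(item);
            } else {
                // Fail: too many bad items, so abort the whole run.
                throw new IllegalStateException("Skip limit exceeded at item " + item, last);
            }
        }
        return result;
    }
}
```

The writer here stands in for a call to a web service or a file write. The point of the sketch is only that the policy lives in the batch layer, so the resource being called does not need to provide it.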

It's also important to understand what Spring Batch doesn't do. Spring Batch is a flexible but not

all-encompassing solution. Just as Spring doesn't reinvent the wheel when it can be avoided, Spring

Batch leaves a few important pieces to the discretion of the implementor. Case in point: Spring Batch

provides a generic mechanism by which to launch a job, be it by the command line, a Unix cron, an

operating system service, Quartz (discussed in Chapter 6), or in response to an event on an enterprise

service bus (for example, the Mule ESB or Spring's own ESB-like solution, Spring Integration, which is

discussed in Chapter 8). Another example is the way Spring Batch manages the state of batch processes.

Spring Batch requires a durable store. The only useful implementation of a JobRepository (an interface

provided by Spring Batch for storing runtime data) requires a database because a database is

transactional and there's no need to reinvent it. Which database you deploy to, however, is

largely unspecified, although useful defaults are provided for you.

Runtime Meta Model

Spring Batch works with a JobRepository, which is the keeper of all the knowledge/metadata for each

job (including component parts such as JobExecution and StepExecution). Each job is composed of

one or more steps, one after another. With Spring Batch 2.0, a step can conditionally follow another

step, allowing for primitive workflows. These steps can also be concurrent: two steps can run at the

same time.

When a job is run, it's often coupled with JobParameters to parameterize the behavior of the job

itself. For example, a job might take a date parameter to determine which records to process. This

coupling is called a JobInstance. A JobInstance is unique because of the JobParameters associated

with it. Each time the same JobInstance (i.e., the same job and JobParameters) is run, it's called a
