Figure 4-6 shows how data flows through a batch process when designed for chunk processing. Here
you see that although each row is still read and processed individually, all the writing for a single chunk
occurs at once when it's time to be committed. This small tweak in processing allows for large
performance gains and opens up the world to many other processing capabilities.
for each chunk
for each item
Figure 4-6. Chunk-based processing
One of the things that chunk-based processing allows you to do is to process chunks remotely.
When you consider things like networking overhead, it's cost prohibitive to process individual items
remotely. However, if you can send over an entire chunk of data at once to a remote processor, then
instead of making performance worse, it can improve performance dramatically.
As you learn more about steps, readers, writers, and scalability throughout the topic, keep in mind
the chunk-based processing that Spring Batch is based on. Let's move on by digging into how to
configure the building blocks of your jobs: steps.
By now, you've identified that a job is really not much more than an ordered list of steps to be executed.
Because of this, steps are configured by listing them within a job. Let's examine how to configure a step
and the various options that are available to you.
When you think about steps in Spring Batch, there are two different types: a step for chunk-based
processing and a tasklet step. Although you used a tasklet step previously in the “Hello, World!” job, you
see more detail about it later. For now, you start by looking at how to configure chunk-based steps.
As you saw earlier, chunks are defined by their commit intervals. If the commit interval is set to 50
items, then your job reads in 50 items, processes 50 items, and then writes out 50 items at once. Because
of this, the transaction manager plays a key part in the configuration of a chunk-based step. Listing 4-23
shows how to configure a basic step for chunk-oriented processing.