The first approach to achieving parallelization is the division of work via multithreaded steps. In Spring
Batch, a job is configured to process work in blocks called chunks, with a commit after each block.
Normally, each chunk is processed in series. If you have 10,000 records, and the commit count is set at
50 records, your job will process records 1 to 50 and then commit, process 51 to 100 and commit, and so
on, until all 10,000 records have been processed. Spring Batch allows you to execute chunks of work in
parallel to improve performance. With three threads, you can in theory increase your throughput up to
threefold, as shown in Figure 2-3.¹
Figure 2-3. Multithreaded steps
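The chunk arithmetic above can be sketched in plain Java. This is only an illustration of the idea, not Spring Batch's own API or configuration: the class name MultithreadedChunks and the methods processChunk and run are hypothetical, and the "commit" is simulated by returning a record count. It splits 10,000 records into chunks of 50 and hands the chunks to a pool of three threads.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class MultithreadedChunks {

    // Hypothetical stand-in for "process these records, then commit".
    // It only reports how many records the chunk covered.
    static int processChunk(int firstRecord, int lastRecord) {
        return lastRecord - firstRecord + 1;
    }

    // Splits the records into chunks and processes the chunks on a fixed
    // pool of worker threads. Returns {number of chunks, records processed}.
    static int[] run(int totalRecords, int commitInterval, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> pending = new ArrayList<>();
            for (int first = 1; first <= totalRecords; first += commitInterval) {
                final int from = first;
                final int to = Math.min(first + commitInterval - 1, totalRecords);
                Callable<Integer> chunk = () -> processChunk(from, to);
                pending.add(pool.submit(chunk));
            }
            int processed = 0;
            for (Future<Integer> result : pending) {
                processed += result.get(); // wait for each chunk to finish
            }
            return new int[] { pending.size(), processed };
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        int[] result = run(10_000, 50, 3);
        // prints "200 chunks, 10000 records"
        System.out.println(result[0] + " chunks, " + result[1] + " records");
    }
}
```

Because the chunks run concurrently, they are no longer guaranteed to complete in record order, which is one reason real throughput gains depend on the work being independent from chunk to chunk.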
The next approach you have available for parallelization is the ability to execute steps in parallel, as
shown in Figure 2-4. Let's say you have two steps, each of which loads an input file into your database,
but there is no relationship between the steps. Does it make sense to wait until one file has been
loaded before the next one is loaded? Of course not, which is why this is a classic example of when to use
the ability to process steps in parallel.
Figure 2-4. Parallel step processing
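As a rough plain-Java sketch of the same idea (again, not Spring Batch's own API), the two independent file loads can be started together and then joined. The class ParallelSteps, the method runInParallel, and the file names are hypothetical, and the "load" is simulated by recording the file name.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;

public class ParallelSteps {

    // Starts one hypothetical "load file" step per file name, runs the steps
    // concurrently, and returns once every step has finished.
    static List<String> runInParallel(String... files) {
        List<String> loaded = new CopyOnWriteArrayList<>(); // safe for concurrent adds
        CompletableFuture<?>[] steps = new CompletableFuture<?>[files.length];
        for (int i = 0; i < files.length; i++) {
            final String file = files[i];
            // Simulated step: a real one would read the file and write to the database.
            steps[i] = CompletableFuture.runAsync(() -> loaded.add(file));
        }
        CompletableFuture.allOf(steps).join(); // wait for all steps to complete
        return loaded;
    }

    public static void main(String[] args) {
        List<String> loaded = runInParallel("customers.csv", "orders.csv");
        System.out.println("steps completed: " + loaded.size());
    }
}
```

Neither step waits on the other, so the order in which the files finish loading is not defined; that is acceptable here only because the steps have no relationship to each other.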
¹ This is a theoretical throughput increase. Many factors can prevent a process from achieving linear
speedup like this.