Figure 11-18. A partitioned job (diagram labels: Step using partitioning, Regular Step, Master Step, Regular Step)
As you can see, the master job step is responsible for dividing the work into partitions to be
processed by each of the slaves. It then sends a message consisting of a StepExecution to be consumed
by the slaves; this describes what to process. Unlike remote chunking, where the data is sent remotely,
partitioning only describes the data to be processed by the slave. For example, the master step may
determine a range of database ids to process for each partition and send that out. Once each slave has
completed the work requested, it returns the StepExecution, updated with the results of the step for the
master to interpret. When all the partitions have been successfully completed, the step is considered
complete, and the job continues. If any of the partitions fail, the step is considered failed, and the job
stops.

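The master/slave exchange described above can be sketched in plain Java, with no Spring Batch classes involved. Everything here is an assumption made for illustration: the `PartitionFlowSketch` and `Range` names are hypothetical, local threads stand in for remote JVMs, and a boolean stands in for the updated StepExecution a slave would return. The sketch only shows that the master hands out descriptions of the data (id ranges) rather than the data itself, and that one failed partition fails the whole step.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Predicate;

class PartitionFlowSketch {

    /** Hypothetical description of work: a range of ids, not the rows themselves. */
    static final class Range {
        final long minId;
        final long maxId;

        Range(long minId, long maxId) {
            this.minId = minId;
            this.maxId = maxId;
        }
    }

    /**
     * Master side: hand each range to a worker, wait for every result, and
     * report success only if all partitions completed. The slave predicate
     * returns true when its range was processed successfully.
     */
    static boolean runStep(List<Range> partitions, Predicate<Range> slave) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, partitions.size()));
        try {
            List<Future<Boolean>> pending = new ArrayList<>();
            for (Range range : partitions) {
                Callable<Boolean> task = () -> slave.test(range);
                pending.add(pool.submit(task));
            }
            boolean allCompleted = true;
            for (Future<Boolean> future : pending) {
                try {
                    if (!future.get()) {
                        allCompleted = false;
                    }
                } catch (ExecutionException e) {
                    // A slave that threw counts as a failed partition.
                    allCompleted = false;
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
            return allCompleted;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<Range> partitions = Arrays.asList(new Range(1, 100), new Range(101, 200));
        System.out.println(runStep(partitions, range -> true));               // true
        System.out.println(runStep(partitions, range -> range.minId != 101)); // false
    }
}
```

In the real framework the slaves run the step's reader, processor, and writer against their assigned range; the predicate here merely marks where that work would happen.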
To look at how partitioning works in a job, let's reuse the geocoding job you used in the remote-
chunking example, but refactor it to use partitioning. Its single step is now executed remotely in a
number of JVMs. Because most of the code is the same, let's start by looking at the one new class that
partitioning requires: an implementation of the Partitioner interface.
The interface has a single method, partition(int gridSize), which returns a Map with partition names as
the keys and an ExecutionContext as each value. Each ExecutionContext in the Map contains the
information a slave step needs in order to know what to do; it becomes the step execution context of the
StepExecution created for that partition. In this case, you store two properties in the ExecutionContext
for each slave: the start id for the customers to process and an end id. Listing 11-29 shows the code for
ColumnRangePartitioner.
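Before the full listing, the range arithmetic such a partitioner performs can be sketched on its own. This is a minimal, Spring-free illustration under stated assumptions, not the book's ColumnRangePartitioner: the class name `RangePartitionSketch`, the `partitionN` naming, and the `startId`/`endId` keys are hypothetical, and a plain `Map<String, Long>` stands in for the ExecutionContext that Spring Batch's Partitioner returns for each partition.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class RangePartitionSketch {

    /**
     * Splits the inclusive id range [minId, maxId] into at most gridSize
     * contiguous partitions. Each inner map is a plain-Java stand-in for
     * the per-partition ExecutionContext.
     */
    static Map<String, Map<String, Long>> partition(long minId, long maxId, int gridSize) {
        if (gridSize < 1 || maxId < minId) {
            throw new IllegalArgumentException("need gridSize >= 1 and maxId >= minId");
        }
        long total = maxId - minId + 1;
        // Ceiling division: no partition holds more than targetSize ids,
        // and the last one may hold fewer.
        long targetSize = (total + gridSize - 1) / gridSize;

        Map<String, Map<String, Long>> result = new LinkedHashMap<>();
        long start = minId;
        int number = 0;
        while (start <= maxId) {
            long end = Math.min(start + targetSize - 1, maxId);
            Map<String, Long> context = new LinkedHashMap<>();
            context.put("startId", start); // hypothetical key name
            context.put("endId", end);     // hypothetical key name
            result.put("partition" + number, context);
            start = end + 1;
            number++;
        }
        return result;
    }

    public static void main(String[] args) {
        // Ten ids over a grid of three yield the ranges 1-4, 5-8, and 9-10.
        System.out.println(partition(1, 10, 3));
    }
}
```

A slave that receives one of these entries would typically use the two values as bounds in its reader's query, so that each slave touches only its own slice of the table.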
Listing 11-29. ColumnRangePartitioner
package com.apress.springbatch.chapter11.partition;
import java.util.HashMap;
import java.util.Map;
import org.springframework.batch.item.ExecutionContext;