Scaling and Tuning - Spring Batch

Java Reference

In-Depth Information

The partition method obtains the min and the max ids in the Customers table (you configure the

table name and column name in the XML). From there, you divide that range based on the number of

slaves you have and create a StepExecution with the min and max id to be processed. When all the

StepExecutions are created and saved in the Map , you return the Map .

ColumnRangePartitioner is the only new class you need to write to use partitioning in your job. The

other required changes are in the configuration. Before you look at the configuration, let's talk about

how the flow of the messages occurs. Figure 11-19 shows how each message is processed.

outbound-

requests

inbound-

requests

requestsQueue

StepRequests

Execution

Handler

Message

Channel

PartitionHandler

outbound-

replies

inbound-

staging

outbound-

staging

Queue

Figure 11-19. Message processing with a partitioned job

The job is configured with a new type of step. Up to this point, you've been using tasklet steps to

execute your code. For a step to be partitioned, you use a partition step . This type of step, unlike a tasklet

step that configures a chunk of processing, configures how to partition a step (via the Partitioner

implementation) and a handler that is responsible for sending the messages to the slaves and receiving

the responses.

 Note Communication with remote workers in partitioned processing doesn't need to be transactional or have

guaranteed delivery.

The Spring Batch Integration project provides an implementation of the PartitionHandler interface

called MessageChannelPartitionHandler . This class uses Spring Integration channels as the form of

communication to eliminate any dependencies on a particular type of communication. For this

example, you use JMS. The communication consists of three queues: a request queue for the master step

to send out the work requests, a staging queue on which the slave steps reply, and a reply queue to send

the consolidated reply back to the master. There are two queues on the way back because each step

replies with the StepExecution from that step. You use an aggregator to consolidate all the responses

into a single list of StepExecutions so they can be processed at once.

Let's look at the configuration for geocodingJob using partitioning; see Listing 11-30.

Search WWH ::

Custom Search

Home