Spring Batch 101 - Spring Batch

Remote Chunking

The last two approaches to parallelization allow you to spread processing across multiple JVMs. In all the previous cases, processing was performed in a single JVM, which can seriously limit your scalability options. When you can scale any part of your process horizontally across multiple JVMs, your ability to keep up with large demands increases.

The first remote-processing option is remote chunking. In this approach, input is read by a standard ItemReader in a master node; the input is then sent via a form of durable communication (JMS, for example) to a remote slave ItemProcessor that is configured as a message-driven POJO. When processing is complete, the slave sends the updated item back to the master for writing. Because this approach reads the data at the master, processes it at the slave, and then sends it back, it can be very network intensive. It is a good fit for scenarios where the cost of I/O is small compared to the cost of the actual processing.
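The flow described above can be sketched in plain Java. This is only an illustration of the message pattern, not Spring Batch code: the class RemoteChunkingSketch, the process method, and the two in-memory queues (standing in for the durable JMS request and reply channels) are all assumptions made for the example, and worker threads stand in for the remote slave JVMs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;

// Illustration only: hypothetical names, no Spring Batch or JMS APIs are used.
class RemoteChunkingSketch {

    // Stand-in for the remote ItemProcessor; upper-casing is an arbitrary example.
    static String process(String item) {
        return item.toUpperCase();
    }

    // The "master" reads and writes; worker threads play the slaves.
    static List<String> run(List<String> input, int workers) {
        // In-memory queues stand in for the durable request and reply channels.
        BlockingQueue<String> requests = new LinkedBlockingQueue<>();
        BlockingQueue<String> replies = new LinkedBlockingQueue<>();

        ExecutorService pool = Executors.newFixedThreadPool(workers);
        for (int i = 0; i < workers; i++) {
            pool.execute(() -> {
                try {
                    while (true) {
                        // Slave: receive an item, process it, send it back.
                        replies.put(process(requests.take()));
                    }
                } catch (InterruptedException e) {
                    // shutdownNow() below interrupts idle workers so they exit.
                    Thread.currentThread().interrupt();
                }
            });
        }

        List<String> written = new ArrayList<>();
        try {
            // Master: "read" each item and send it out for processing.
            for (String item : input) {
                requests.put(item);
            }
            // Master: collect one reply per item and "write" it.
            for (int i = 0; i < input.size(); i++) {
                written.add(replies.take());
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        } finally {
            pool.shutdownNow();
        }
        return written;
    }
}
```

Calling RemoteChunkingSketch.run(Arrays.asList("a", "b", "c"), 2) returns the three processed items, in whatever order the workers finish. Note that every item crosses the channel twice, once in each direction, which is why the approach only pays off when processing dominates the cost of moving the data.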

Partitioning

The final method for parallelization within Spring Batch is partitioning, shown in Figure 2-5. Again, you use a master/slave configuration, but this time you don't need a durable method of communication, and the master serves only as a controller for a collection of slave steps. In this case, each of your slave steps is self-contained and configured the same as if it were locally deployed. The only difference is that the slave steps receive their work from the master node instead of the job itself. When all the slaves have completed their work, the master step is considered complete. This configuration doesn't require durable communication with guaranteed delivery because the JobRepository guarantees that no work is duplicated and all work is completed. This is unlike the remote-chunking approach, in which the JobRepository has no knowledge of the state of the distributed work.

[Diagram labels: Step 1; Step 2 (Master); Step 2 (Slave) x3]

Figure 2-5. Partitioning work
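The division of labor in partitioning can be sketched in plain Java as well. Again, this is a minimal illustration under stated assumptions rather than Spring Batch code: the class PartitioningSketch, the Range type, and the method names are hypothetical, the work is assumed to be a contiguous range of record ids, and "processing" is reduced to summing the ids. The JobRepository bookkeeping mentioned in the text is not modeled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Illustration only: hypothetical names, no Spring Batch APIs are used.
class PartitioningSketch {

    // One partition: an inclusive range of record ids.
    static final class Range {
        final int from;
        final int to;
        Range(int from, int to) { this.from = from; this.to = to; }
    }

    // Master side: split [min, max] into at most gridSize ranges of near-equal size.
    static List<Range> partition(int min, int max, int gridSize) {
        List<Range> ranges = new ArrayList<>();
        int total = max - min + 1;
        int start = min;
        for (int i = 0; i < gridSize; i++) {
            // The first (total % gridSize) partitions take one extra id.
            int size = total / gridSize + (i < total % gridSize ? 1 : 0);
            if (size == 0) continue;
            ranges.add(new Range(start, start + size - 1));
            start += size;
        }
        return ranges;
    }

    // Slave side: a self-contained step that handles only its own range.
    static long slaveStep(Range range) {
        long sum = 0;
        for (int id = range.from; id <= range.to; id++) {
            sum += id;
        }
        return sum;
    }

    // Master step: hand out partitions, then wait until every slave step is done.
    static long runMasterStep(int min, int max, int gridSize) {
        ExecutorService pool = Executors.newFixedThreadPool(gridSize);
        try {
            List<Future<Long>> results = new ArrayList<>();
            for (Range range : partition(min, max, gridSize)) {
                Callable<Long> step = () -> slaveStep(range);
                results.add(pool.submit(step));
            }
            long total = 0;
            for (Future<Long> result : results) {
                total += result.get(); // blocks until that slave step completes
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for slave steps", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a slave step failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

For example, partition(1, 10, 3) yields the ranges 1 to 4, 5 to 7, and 8 to 10. Unlike the remote-chunking sketch, only the small range descriptions travel from the master to the slaves; each slave reads and writes its own data.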

Batch Administration

Any enterprise system must be able to start and stop processes, monitor their current state, and even view results. With web applications, this is easy: in the web application, you see the results of each action you request, and tools like Google Analytics provide various metrics on how your application is being used and how it is performing.

However, in the batch world, you may have a single Java process running on a server for eight hours with no output other than log files and the database the process is working on. This is hardly a manageable situation. For this reason, Spring has developed a web application called Spring Batch Admin that lets you start and stop jobs and also provides details about each job execution.
