Job Repository and Metadata - Spring Batch


CHAPTER 5

Job Repository and Metadata

When you look into writing a batch process, running it stand-alone without a UI isn't that hard. When you dig into Spring Batch, the execution of a job amounts to nothing more than using an implementation of Spring's TaskExecutor to run a separate task. You don't need Spring Batch to do that.
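To make the point concrete, here is a minimal sketch that runs a task off the main thread with no UI, using only the JDK. It relies on java.util.concurrent rather than Spring itself; Spring's TaskExecutor extends java.util.concurrent.Executor and exposes the same execute(Runnable) method, so the shape is similar. The class and method names (HeadlessTaskDemo, runInBackground) are made up for this illustration.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; not part of Spring or Spring Batch.
final class HeadlessTaskDemo {

    // Runs the task on a separate thread and waits up to 10 seconds for it.
    // Returns true if the executor terminated within that time.
    static boolean runInBackground(Runnable task) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        executor.execute(task);
        executor.shutdown(); // accept no new tasks; the submitted one still runs
        try {
            return executor.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            return false;
        }
    }

    public static void main(String[] args) {
        AtomicInteger processed = new AtomicInteger();
        boolean finished = runInBackground(() -> {
            for (int i = 0; i < 1_000; i++) {
                processed.incrementAndGet(); // stand-in for real batch work
            }
        });
        System.out.println("finished=" + finished + ", processed=" + processed.get());
    }
}
```

Nothing here records what the task did or how far it got, which is the gap the rest of this chapter addresses.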

Where things get interesting, however, is when things go wrong. If your batch job is running and an error occurs, how do you recover? How does your job know where it was in processing when the error occurred, and what should happen when the job is restarted? State management is an important part of processing large volumes of data. This is one of the key features that Spring Batch brings to the table.

Spring Batch, as discussed previously in this book, maintains the state of a job in a job repository as the job executes. It then uses this information when a job is restarted or an item is retried to determine how to continue. The power of this feature can't be overstated.
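The idea of restarting from saved state can be illustrated with a toy sketch. This is not Spring Batch's API: ToyJobRepository, RestartSketch, and the job name "demoJob" are hypothetical names invented here, and the state lives in a plain map, whereas a real job repository would typically persist it so that it survives the process. The sketch only shows the principle: record a checkpoint after each item, and on restart resume from that checkpoint instead of from the beginning.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of checkpoint-and-restart; not Spring Batch code.
final class RestartSketch {

    // Toy stand-in for a job repository: maps a job name to the index of the
    // next item that still needs processing.
    static final class ToyJobRepository {
        private final Map<String, Integer> nextIndexByJob = new HashMap<>();

        int nextIndex(String jobName) {
            return nextIndexByJob.getOrDefault(jobName, 0);
        }

        void saveNextIndex(String jobName, int nextIndex) {
            nextIndexByJob.put(jobName, nextIndex);
        }
    }

    // Processes items starting at the saved checkpoint. Throws when it reaches
    // an item equal to failOn, leaving the checkpoint pointing at that item.
    static void run(String jobName, List<String> items, String failOn,
                    ToyJobRepository repository, List<String> output) {
        for (int i = repository.nextIndex(jobName); i < items.size(); i++) {
            String item = items.get(i);
            if (item.equals(failOn)) {
                throw new IllegalStateException("failed on " + item);
            }
            output.add(item.toUpperCase());
            repository.saveNextIndex(jobName, i + 1); // checkpoint after each item
        }
    }

    public static void main(String[] args) {
        ToyJobRepository repository = new ToyJobRepository();
        List<String> items = List.of("a", "b", "c", "d");
        List<String> output = new ArrayList<>();

        try {
            run("demoJob", items, "c", repository, output); // first attempt fails at "c"
        } catch (IllegalStateException e) {
            System.out.println("First run " + e.getMessage()
                    + ", checkpoint=" + repository.nextIndex("demoJob"));
        }

        run("demoJob", items, null, repository, output); // restart resumes at "c"
        System.out.println(output); // each item appears once, none reprocessed
    }
}
```

Because the second call reads the checkpoint before looping, the items handled before the failure are not processed again.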

Another aspect of batch processing in which the job repository is helpful is monitoring. The ability to see how far a job is in its processing, as well as to trend elements such as how long operations take or how many items were retried due to errors, is vital in the enterprise environment. The fact that Spring Batch does the number gathering for you makes this type of trending much easier.

This chapter covers job repositories in detail. It goes over ways to configure a job repository for most environments by using either a database or an in-memory repository. You also look briefly at how the configuration of the job repository affects performance. After you have the job repository configured, you learn how to put the job information it stores to use with the JobExplorer and the JobOperator.

Configuring the Job Repository

In order for Spring Batch to be able to maintain state, the job repository needs to be available. Spring Batch offers two options by default: an in-memory repository and a persisted repository in a database. This section looks at how to configure each of those options as well as the performance impacts of both. Let's start with the simpler option, the in-memory job repository.

Using an In-Memory Job Repository

The opening paragraphs of this chapter laid out a list of benefits for the job repository, such as the ability to maintain state from execution to execution and trend run statistics from run to run. However, you'll almost never use an in-memory repository for those reasons. That's because when the process ends, all of that data is lost. So, why would you use an in-memory repository at all?

The answer is that sometimes you don't need to persist the data. For example, in development, it's common to run jobs with an in-memory repository so that you don't have to worry about maintaining the job schema in a database. This also allows you to execute the same job multiple times with the same
