Data Loads and Data Warehouses
In this example, I didn't tune the table at all. For example, there are no indexes on any of the columns
besides the primary key. This is to avoid complicating the example. Great care should be taken with a table
like this one in a nontrivial, production-bound application.
Spring Batch applications are workhorse applications and have the potential to reveal bottlenecks in your
application you didn't know you had. Imagine suddenly being able to achieve 1 million new database
insertions every 10 minutes. Would your database grind to a halt? Insert speed can be a critical factor in
the speed of your application. Software developers will (hopefully) think about schema in terms of how well
it enforces the constraints of the business logic and how well it serves the overall business model.
However, it's important to wear another hat, that of a DBA, when writing applications such as this one. A
common solution is to create a denormalized table whose contents can be coerced into valid data once
inside the database, perhaps by a trigger on inserts. This is a common technique in data warehousing.
Later, you'll explore using Spring Batch to do processing on a record before insertion. This lets the
developer verify or override the input into the database. This processing, in tandem with a conservative
application of constraints that are best expressed in the database, can make for applications that are very
robust and quick.
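The pre-insertion processing described above can be sketched in plain Java. The `ItemProcessor` interface below mirrors the contract of Spring Batch's `org.springframework.batch.item.ItemProcessor`, and the `Registration` record type is a hypothetical stand-in for whatever the CSV rows map to; returning `null` is how a processor tells Spring Batch to filter a record out before it ever reaches the writer.

```java
// Minimal stand-in for Spring Batch's ItemProcessor contract:
// transform the item, or return null to filter it out of the chunk.
interface ItemProcessor<I, O> {
    O process(I item) throws Exception;
}

// Hypothetical record type standing in for a parsed CSV row.
class Registration {
    final String email;

    Registration(String email) {
        this.email = email;
    }
}

// Normalizes the email address and filters out records with no address,
// so only valid rows are handed to the JDBC writer.
class RegistrationValidator implements ItemProcessor<Registration, Registration> {
    public Registration process(Registration item) {
        if (item.email == null || item.email.trim().isEmpty()) {
            return null; // filtered: this row is never written
        }
        return new Registration(item.email.trim().toLowerCase());
    }
}

public class ProcessorDemo {
    public static void main(String[] args) throws Exception {
        RegistrationValidator validator = new RegistrationValidator();
        System.out.println(validator.process(new Registration("  USER@Example.COM ")).email);
        System.out.println(validator.process(new Registration("   ")) == null);
    }
}
```

In a real job, a processor like this is declared as a bean and wired into the chunk via the `processor` attribute, sitting between the reader and the writer.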
The Job Configuration
The configuration for the job is as follows:
<job
job-repository="jobRepository"
id="insertIntoDbFromCsvJob">
<step id="step1">
<tasklet transaction-manager="transactionManager">
<chunk
reader="csvFileReader"
writer="jdbcItemWriter"
commit-interval="5"
/>
</tasklet>
</step>
</job>
As described earlier, a job consists of steps, which are the real workhorses of a given job. The steps can be as complex or as simple as you like. Indeed, a step could be considered the smallest unit of work for a job. Input (what's read) is passed to the step and potentially processed; then output (what's written) is created from the step. This processing is spelled out using a Tasklet. You can provide your own Tasklet implementation or simply use one of the preconfigured implementations for different processing scenarios. These implementations are made available as subelements of the tasklet element. One of the most important aspects of batch processing is chunk-oriented processing, which is employed here using the chunk element.
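The effect of the chunk element and its commit-interval can be sketched as a simple loop: read items one at a time, and every time the chunk fills up, write the whole batch in a single transaction. The sketch below uses an `Iterator` and a list of lists as stand-ins for Spring Batch's `ItemReader`, `ItemWriter`, and transaction boundaries; the interval of 5 matches the configuration above.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class ChunkLoopSketch {
    // Groups the items a reader produces into chunks of commitInterval items.
    // Each inner list stands for one ItemWriter.write() call, i.e. one commit.
    static <T> List<List<T>> runChunks(Iterator<T> reader, int commitInterval) {
        List<List<T>> commits = new ArrayList<>();
        List<T> chunk = new ArrayList<>();
        while (reader.hasNext()) {
            chunk.add(reader.next());             // read one item at a time
            if (chunk.size() == commitInterval) { // chunk is full:
                commits.add(chunk);               // write it and commit
                chunk = new ArrayList<>();
            }
        }
        if (!chunk.isEmpty()) {
            commits.add(chunk);                   // final, partial chunk
        }
        return commits;
    }

    public static void main(String[] args) {
        List<Integer> items = List.of(1, 2, 3, 4, 5, 6, 7);
        List<List<Integer>> commits = runChunks(items.iterator(), 5);
        System.out.println(commits.size()); // 7 items at interval 5 -> 2 commits
    }
}
```

Batching writes this way is what makes the commit-interval a tuning knob: a larger interval means fewer transactions and better insert throughput, at the cost of more work redone if a chunk fails and rolls back.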