Distributed Spring - Spring Enterprise Recipes: A Problem Solution Approach


On the Java landscape, this problem is even more pronounced because of Java's difficulty in

addressing large amounts of RAM (anecdotally, 2GB to 4GB is about the max a single JVM can usefully

address). There are garbage collectors in the works that seek to fix some of these issues, but the fact

remains that a single computer can have far more RAM than a single JVM could ever usefully deal with.

Parallelization is a must. Today, more and more enterprises are deploying entire virtualized operating

system stacks on one server simply to isolate Java applications and fully exploit the hardware.

Thus, distribution isn't just a function of resilience or capability; it's a function of common-sense

investing.

There are costs to parallelization, as well. There's always going to be some constraint, and very

rarely is an entire system equally scalable. The cost of coordinating state between nodes, for example,

might be too high because the network or hard disks impose latency. There are also other constraints.

Notably, not all operations are parallelizable. It's important to design systems with this in mind. An

example might be the overall processing of a person's uploaded photos (as happens in many web sites

today). Suppose you measure the elapsed time from the moment they upload the batch to the moment a

process has watermarked the photos and added them to an online photo album, with the

whole process executed serially. Some of these steps are not parallelizable. The one part that is, the

watermarking, can yield only a bounded speedup, and little can be done beyond that.
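The photo example can be sketched in Java as follows. This is only an illustration under stated assumptions: the class name PhotoPipeline, the watermark method, and the string file names are hypothetical stand-ins, and a plain java.util.concurrent thread pool stands in for whatever distribution mechanism you actually choose. Only the watermarking stage is fanned out across workers; upload and posting remain serial.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PhotoPipeline {

    // Hypothetical stand-in for the real watermarking work.
    static String watermark(String photo) {
        return photo + "-watermarked";
    }

    // The only parallelizable stage: each photo goes to whichever worker is free.
    static List<String> watermarkAll(List<String> photos, int workers)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<String>> pending = new ArrayList<>();
            for (String photo : photos) {
                Callable<String> task = () -> watermark(photo);
                pending.add(pool.submit(task));
            }
            List<String> done = new ArrayList<>();
            for (Future<String> result : pending) {
                done.add(result.get()); // collected in submission order
            }
            return done;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<String> uploaded = Arrays.asList("a.jpg", "b.jpg", "c.jpg"); // serial: upload
        List<String> marked = watermarkAll(uploaded, 10);                 // parallel: watermark
        System.out.println(marked);                                       // serial: post to the album
    }
}
```

However many workers the pool has, the serial upload and posting steps still take the same time, which is why the gain from this stage is bounded.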

You can describe these gains. Amdahl's law, also known as Amdahl's argument, is a formula to find

the maximum expected improvement to an overall system when only part of the system is improved. It

is shown here:

Speedup = 1 / ((1 - P) + (P / N))

It describes the relationship between a solution's execution time when serially executed and when

executed in parallel with the same problem set. Thus, for 90 photos, if we know that it takes a minute for

each photo, and that uploading takes 5 minutes, and that posting the resulting photos to the repository

takes 5 minutes, the total time is 100 minutes when executed serially. Let's assume we add 9 workers to

the watermarking process, for a total of 10 processes that watermark. In the equation, P is the portion of

the process that can be parallelized, and N is the factor by which that portion might be parallelized (that

is, the number of workers, in this case). For the process described, watermarking accounts for 90 of the

100 serial minutes, and each photo could be given to a different worker, so 90% of the serial

execution is parallelizable (P = 0.9). If you have 10 nodes working together (N = 10), the

equation is: 1/((1-.9) + (.9 / 10)), or 5.263. So, with 10 workers, the process could be 5x faster. With

100 workers, the equation yields 9.174, or 9x faster. It may not make sense to continue adding nodes as

you'll achieve increasingly smaller gains.
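The arithmetic above can be checked with a small Java sketch. The class name Amdahl and the method name speedup are hypothetical, chosen only for this illustration; p and n correspond to P and N in the text.

```java
public class Amdahl {

    // Amdahl's law: p is the parallelizable fraction, n is the number of workers.
    static double speedup(double p, int n) {
        return 1.0 / ((1.0 - p) + (p / n));
    }

    public static void main(String[] args) {
        double p = 0.9; // 90 of the 100 serial minutes are watermarking
        for (int n : new int[] {1, 10, 100, 1000}) {
            System.out.printf("%4d workers -> %.3fx%n", n, speedup(p, n));
        }
        // As n grows, the speedup approaches 1 / (1 - p) but never reaches it.
        System.out.printf("upper bound  -> %.3fx%n", 1.0 / (1.0 - p));
    }
}
```

With P = 0.9 the speedup can never exceed 1/(1 - P), or 10x, no matter how many workers are added, because the 10 serial minutes of uploading and posting remain.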

Building an effective distributed solution, then, is an application of cost/benefit analysis. Spring has

no direct support for distributed paradigms, per se, because plenty of other solutions do a great job

already. Often, these solutions make Spring integration a first priority because it's a de facto standard.

In some cases, these projects have forgone their own configuration format and use Spring itself as the

configuration mechanism. If you decide to employ distribution, you'll be glad to know that there are

many projects designed to meet the call, whatever it may be.

In this chapter, we discuss a few solutions that are Spring-friendly and ready. A lot of these solutions

are possible because of Spring's support for “components,” such as its XML schema support and

runtime class detection. These technologies often require you to change your frame of mind when

building solutions, even if ever so slightly, as compared to solutions built using JEE, but being able to

rely on your Spring skills is powerful. Other times, these solutions may not even be visible, except as

configuration. Further still, a lot of these solutions expose themselves as standard interfaces familiar to

JEE developers, or as infrastructure (such as, for example, backing for an HTTP session, or as a cluster-

ready message queue) that goes unnoticed and isolated, except at the configuration level, thanks to

Spring's dependency injection.
